User:On Sober Reflection/p-value

From Wikipedia, the free encyclopedia

In statistical hypothesis testing, the p-value is a measure of the strength of evidence against a hypothesis (called the "null hypothesis") about a larger population than the sample being tested. It is a number between 0 and 1 giving the probability, assuming the null hypothesis is true, that the chance selection of the sample would produce results at least as extreme as those actually observed.[1] A small p-value means there is a low probability of obtaining such results by chance when the null hypothesis is true.[2]

For example, in an experiment in which 10 subjects receive a placebo and another 10 receive an experimental diuretic, a researcher might report that the subjects taking the diuretic output 45 milliliters more urine on average, with the null hypothesis being "there is no difference in output between the two groups" and a p-value of 0.031. This means that, if there were truly no difference between the groups, the random selection of subjects alone would produce a difference at least this large only 3.1% of the time. It does not mean there is a 3.1% chance that the null hypothesis is true.
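One way to make the example concrete is a permutation test, which estimates how often a random relabelling of the 20 subjects produces a mean difference at least as large as the one observed. The sketch below is only an illustration: the function name `permutation_p_value`, the one-sided comparison, and the urine-output figures are all assumptions invented here, not data or methods from the cited sources, and the numbers are not meant to reproduce the 45-milliliter difference or the p-value of 0.031.

```python
import random


def permutation_p_value(treated, control, n_resamples=10_000, seed=0):
    """Estimate a one-sided p-value for mean(treated) - mean(control).

    Under the null hypothesis the group labels are arbitrary, so we
    shuffle them and count how often the shuffled difference is at
    least as large as the observed one.
    """
    rng = random.Random(seed)
    n_treated = len(treated)
    n_control = len(control)
    observed = sum(treated) / n_treated - sum(control) / n_control

    pooled = list(treated) + list(control)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = (sum(pooled[:n_treated]) / n_treated
                - sum(pooled[n_treated:]) / n_control)
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction: the observed labelling counts as one resample,
    # so the estimate can never be exactly zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Hypothetical urine outputs in milliliters, invented for illustration.
placebo = [1180, 1250, 1320, 1210, 1290, 1150, 1340, 1270, 1230, 1300]
diuretic = [1260, 1310, 1350, 1240, 1330, 1220, 1400, 1290, 1280, 1360]

p = permutation_p_value(diuretic, placebo)
print(f"estimated one-sided p-value: {p:.3f}")
```

The value printed is the estimated probability of a difference this large or larger when the labels carry no information, which is the quantity the p-value describes. It is not the probability that the diuretic has no effect.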

From source 1: Third, it is not true that the p value is the probability that any observed difference is simply attributable to the chance selection of subjects from the target population. The p value is calculated based on an assumption that chance is the only reason for observing any difference. Thus it cannot provide evidence for the truth of that statement.

From p-value talk page: It should be "The p-value is the upper limit (limit supremum) of the probability of obtaining a random sample with a test statistic more extreme (more contradictory to the null hypothesis) than what was observed when the null hypothesis is true."


The p-value is, for a given statistical model, the worst-case probability that, when the null hypothesis is true, the statistical summary (such as the absolute value of the sample mean difference between two compared groups) would be greater than or equal to the value actually observed.[3] The use of p-values in statistical hypothesis testing is common in many fields of research[4] such as physics, economics, finance, political science, psychology,[5] biology, criminal justice, criminology, and sociology.[6] The misuse of p-values is a controversial topic in metascience.[7]
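The "worst-case probability" wording, like the talk-page suggestion of an upper limit, can be written as a supremum over the null hypothesis. The notation here is an assumption introduced for illustration and does not come from the cited sources: T is the test statistic, t_obs is its observed value, and Θ₀ is the set of parameter values consistent with the null hypothesis.

```latex
p = \sup_{\theta \in \Theta_0} \Pr_{\theta}\left( T \ge t_{\mathrm{obs}} \right)
```

When the null hypothesis specifies a single distribution, Θ₀ contains one value and the supremum reduces to an ordinary probability.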

Italicisation, capitalisation and hyphenation of the term vary. For example, AMA style uses "P value", APA style uses "p value", and the American Statistical Association uses "p-value".[8]

  1. ^ Dorey, Frederick (Aug 2010). "In Brief: The P Value: What Is It and What Does It Tell You?". Clinical Orthopaedics and Related Research: 2297–2298.
  2. ^ Ferreira, Juliana Carvalho; Patino, Cecilia Maria (Sep–Oct 2015). "What does the p value really mean?". Jornal Brasileiro de Pneumologia: 485.
  3. ^ Wasserstein, Ronald L.; Lazar, Nicole A. (7 March 2016). "The ASA's Statement on p-Values: Context, Process, and Purpose". The American Statistician. 70 (2): 129–133. doi:10.1080/00031305.2016.1154108.
  4. ^ Bhattacharya, Bhaskar; Habtzghi, DeSale (2002). "Median of the p value under the alternative hypothesis". The American Statistician. 56 (3): 202–6. doi:10.1198/000313002146.
  5. ^ Wetzels, R.; Matzke, D.; Lee, M. D.; Rouder, J. N.; Iverson, G. J.; Wagenmakers, E. -J. (2011). "Statistical Evidence in Experimental Psychology: An Empirical Comparison Using 855 t Tests". Perspectives on Psychological Science. 6 (3): 291–298. doi:10.1177/1745691611406923. PMID 26168519.
  6. ^ Babbie, E. (2007). The practice of social research 11th ed. Thomson Wadsworth: Belmont, California.
  7. ^ Ioannidis, John P. A.; Ware, Jennifer J.; Wagenmakers, Eric-Jan; Simonsohn, Uri; Chambers, Christopher D.; Button, Katherine S.; Bishop, Dorothy V. M.; Nosek, Brian A.; Munafò, Marcus R. (January 2017). "A manifesto for reproducible science". Nature Human Behaviour. p. 0021. doi:10.1038/s41562-016-0021. Retrieved 9 May 2019.
  8. ^ http://magazine.amstat.org/wp-content/uploads/STATTKadmin/style%5B1%5D.pdf