P-Value Precision and Reproducibility
Dennis D Boos, Leonard A Stefanski
American Statistician, volume 65, issue 4, pages 213-221, 2011 (Journal Article)
DOI: https://doi.org/10.1198/tas.2011.10129
Citations: 178
Abstract
P-values are useful statistical measures of evidence against a null hypothesis. In contrast to other statistical estimates, however, their sample-to-sample variability is usually not considered or estimated, and therefore not fully appreciated. Via a systematic study of log-scale p-value standard errors, bootstrap prediction bounds, and reproducibility probabilities for future replicate p-values, we show that p-values exhibit surprisingly large variability in typical data situations. In addition to providing context to discussions about the failure of statistical results to replicate, our findings shed light on the relative value of exact p-values vis-a-vis approximate p-values, and indicate that the use of *, **, and *** to denote levels .05, .01, and .001 of statistical significance in subject-matter journals is about the right level of precision for reporting p-values when judged by widely accepted rules for rounding statistical estimates.
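The variability described in the abstract can be illustrated with a small bootstrap experiment. The sketch below is not the authors' procedure: it assumes a simulated sample of 30 normal observations, a two-sided test of a zero mean based on a normal approximation, and a plain nonparametric bootstrap. The function names `z_test_p_value` and `bootstrap_neg_log10_p` are hypothetical and chosen only for this illustration. It resamples the data, recomputes the p-value each time, and summarizes the spread of -log10(p).

```python
import math
import random
from statistics import NormalDist, mean, stdev


def z_test_p_value(sample, mu0=0.0):
    """Two-sided p-value for H0: mean == mu0 (normal approximation, an assumption)."""
    n = len(sample)
    se = stdev(sample) / math.sqrt(n)
    z = (mean(sample) - mu0) / se
    # Use the lower tail directly so very small p-values keep their precision.
    return 2.0 * NormalDist().cdf(-abs(z))


def bootstrap_neg_log10_p(sample, n_boot=2000, seed=0):
    """Return -log10(p) for each bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(sample)
    values = []
    while len(values) < n_boot:
        resample = [rng.choice(sample) for _ in range(n)]
        if len(set(resample)) < 2:
            continue  # skip degenerate resamples with zero variance
        p = z_test_p_value(resample)
        values.append(-math.log10(max(p, 1e-300)))  # floor guards against log(0)
    return values


# Simulated data (illustrative only): true mean 0.5, standard deviation 1.
data_rng = random.Random(1)
data = [data_rng.gauss(0.5, 1.0) for _ in range(30)]

p_obs = z_test_p_value(data)
neg_logs = sorted(bootstrap_neg_log10_p(data))
se_log10 = stdev(neg_logs)                   # bootstrap SE of -log10(p)
lo = neg_logs[int(0.05 * len(neg_logs))]     # approximate 5th percentile
hi = neg_logs[int(0.95 * len(neg_logs)) - 1] # approximate 95th percentile

print(f"observed p-value: {p_obs:.4g}")
print(f"bootstrap SE of -log10(p): {se_log10:.2f}")
print(f"central 90% of resampled p-values: {10 ** -hi:.2g} to {10 ** -lo:.2g}")
```

Working on the log scale keeps the summary interpretable: a standard error near 1 on the -log10 scale means a replicate p-value could plausibly differ from the observed one by about a factor of ten.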
About the journal:
Are you looking for general-interest articles about current national and international statistical problems and programs; interesting and fun articles of a general nature about statistics and its applications; or the teaching of statistics? Then you are looking for The American Statistician (TAS), published quarterly by the American Statistical Association. TAS contains timely articles organized into the following sections: Statistical Practice, General, Teacher's Corner, History Corner, Interdisciplinary, Statistical Computing and Graphics, Reviews of Books and Teaching Materials, and Letters to the Editor.