Pitfalls of significance testing and $p$-value variability: An econometrics perspective

Impact Factor 15.4, JCR Q1, Statistics & Probability

Statistics Surveys, Pub Date: 2018-01-01, pp. 136-172, DOI: 10.1214/18-SS122

N. Hirschauer, Sven Grüner, O. Musshoff, C. Becker


Citations: 10

Abstract




Data on how many scientific findings are reproducible are generally bleak, and in recent years a wealth of papers have warned against misuses of the p-value and the false findings that result. This paper discusses what we can (and cannot) learn from the p-value, which is still widely considered the gold standard of statistical validity. We aim to provide a non-technical and easily accessible resource for statistical practitioners who wish to spot and avoid misinterpretations and misuses of statistical significance tests. For this purpose, we first classify and describe the most widely discussed (“classical”) pitfalls of significance testing, and review published work on these misuses with a focus on regression-based “confirmatory” studies. This includes a description of the single-study bias and a simulation-based illustration of how proper meta-analysis compares to misleading significance counts (“vote counting”). Going beyond the classical pitfalls, we also use simulation to provide intuition that relying on the statistical estimate “p-value” as a measure of evidence without considering its sample-to-sample variability falls short of the mark even within an otherwise appropriate interpretation. We conclude with a discussion of the
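The abstract does not describe the authors' simulation design, so the following is only a minimal sketch of the two ideas it names: the sample-to-sample variability of the p-value, and the contrast between "vote counting" and a pooled estimate. Everything specific here is an assumption made for illustration: the true slope of 0.2, the sample size of 50, the 1000 replications, the use of a normal reference distribution for the test statistic, and a simple fixed-effect (inverse-variance) pooling rule standing in for "proper meta-analysis".

```python
import math
import random
import statistics


def ols_slope(x, y):
    """Return (slope estimate, standard error) for the model y = a + b*x + e."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return b, se


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference (an approximation)."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_studies=1000, n=50, beta=0.2, sigma=1.0, seed=1):
    """Draw n_studies independent samples from the same data-generating process.

    All parameter values are hypothetical choices for this sketch.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_studies):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta * xi + rng.gauss(0, sigma) for xi in x]
        b, se = ols_slope(x, y)
        results.append((b, se, two_sided_p(b / se)))
    return results


def fixed_effect_pool(results):
    """Inverse-variance weighted pooled slope and its standard error."""
    weights = [1 / se ** 2 for _, se, _ in results]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _, _) in zip(weights, results)) / total
    return pooled, math.sqrt(1 / total)


results = simulate()
p_values = sorted(p for _, _, p in results)
share_significant = sum(p < 0.05 for p in p_values) / len(p_values)
pooled, pooled_se = fixed_effect_pool(results)

# Spread of p-values across replications of the *same* true effect.
print(f"10th / 90th percentile of p: "
      f"{p_values[len(p_values) // 10]:.3f} / {p_values[9 * len(p_values) // 10]:.3f}")
# Vote counting: share of studies that reach p < 0.05.
print(f"share of 'significant' studies: {share_significant:.2f}")
# Pooling: combined estimate of the slope and its standard error.
print(f"pooled slope: {pooled:.3f} (SE {pooled_se:.4f})")
```

Under these assumed settings every simulated study estimates the same non-zero effect, yet the individual p-values range widely and only a minority fall below 0.05, so a tally of significant results would understate the evidence that the pooled estimate makes visible.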


Source journal

Statistics Surveys (Statistics & Probability)

CiteScore: 11.70

Self-citation rate: 0.00%

About the journal: Statistics Surveys publishes survey articles in theoretical, computational, and applied statistics. The style of articles may range from reviews of recent research to graduate textbook exposition. Articles may be broad or narrow in scope. The essential requirements are a well-specified topic and target audience, together with clear exposition. Statistics Surveys is sponsored by the American Statistical Association, the Bernoulli Society, the Institute of Mathematical Statistics, and the Statistical Society of Canada.