Revisiting the Large n (Sample Size) Problem: How to Avert Spurious Significance Results

Journal: Stats (IF 0.9, JCR Q4, Mathematics, Interdisciplinary Applications)
Published: 2023-12-05
DOI: 10.3390/stats6040081
Author: Aris Spanos
Citations: 0

Abstract

Although large data sets are generally viewed as advantageous for their ability to provide more precise and reliable evidence, it is often overlooked that these benefits are contingent upon certain conditions being met. The primary condition is the approximate validity (statistical adequacy) of the probabilistic assumptions comprising the statistical model Mθ(x) applied to the data. In the case of a statistically adequate Mθ(x) and a given significance level α, as n increases, the power of a test increases, and the p-value decreases due to the inherent trade-off between type I and type II error probabilities in frequentist testing. This trade-off raises concerns about the reliability of declaring ‘statistical significance’ based on conventional significance levels when n is exceptionally large. To address this issue, the author proposes that a principled approach, in the form of post-data severity (SEV) evaluation, be employed. The SEV evaluation represents a post-data error probability that converts unduly data-specific ‘accept/reject H0 results’ into evidence either supporting or contradicting inferential claims regarding the parameters of interest. This approach offers a more nuanced and robust perspective in navigating the challenges posed by the large n problem.
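
The large n effect and the post-data severity idea described in the abstract can be illustrated with a small numerical sketch. The setup below is an assumption made for illustration and is not taken from the paper: a simple Normal model with known sigma, the one-sided hypotheses H0: mu <= mu0 vs. H1: mu > mu0, and a fixed observed sample mean. The function names (`p_value`, `power`, `severity_after_reject`) are hypothetical. After a rejection of H0, the severity of the claim mu > mu1 is evaluated here as the probability, under mu = mu1, of a sample mean no larger than the one observed.

```python
import math


def phi(z):
    """Standard Normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_value(xbar, mu0, sigma, n):
    """One-sided p-value for H0: mu <= mu0 vs. H1: mu > mu0 (sigma known)."""
    d = math.sqrt(n) * (xbar - mu0) / sigma  # observed test statistic
    return 1.0 - phi(d)


def power(mu1, mu0, sigma, n, c_alpha=1.645):
    """Probability of rejecting H0 when mu = mu1; c_alpha = 1.645 is about alpha = 0.05."""
    return 1.0 - phi(c_alpha - math.sqrt(n) * (mu1 - mu0) / sigma)


def severity_after_reject(xbar, mu1, sigma, n):
    """Post-data severity of the claim mu > mu1 after rejecting H0:
    P(sample mean <= observed xbar; mu = mu1)."""
    return phi(math.sqrt(n) * (xbar - mu1) / sigma)


if __name__ == "__main__":
    mu0, sigma, xbar = 0.0, 1.0, 0.02  # illustrative values only
    for n in (100, 10_000, 1_000_000):
        print(
            f"n={n:>9}  p-value={p_value(xbar, mu0, sigma, n):.4g}  "
            f"power at mu1=0.02: {power(0.02, mu0, sigma, n):.4g}  "
            f"SEV(mu > 0.01)={severity_after_reject(xbar, 0.01, sigma, n):.4g}  "
            f"SEV(mu > 0.03)={severity_after_reject(xbar, 0.03, sigma, n):.4g}"
        )
```

In this sketch the same tiny observed discrepancy (0.02) moves from clearly non-significant to overwhelmingly significant as n grows. The severity values, in contrast, indicate which discrepancies from mu0 the data actually warrant: claims of discrepancies below the observed mean gain support with larger n, while claims of discrepancies above it lose support.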
Source journal metrics
CiteScore: 0.60
Self-citation rate: 0.00%
Articles published: 0
Review time: 7 weeks