Assessing replicability with the sceptical p-value: Type-I error control and sample size planning
Authors: Charlotte Micheloud, F. Balabdaoui, L. Held
Journal: Statistica Neerlandica, vol. 77, pp. 573-591 (impact factor 1.4; JCR Q2, Statistics & Probability)
Published: 2022-07-01
DOI: 10.1111/stan.12312 (https://doi.org/10.1111/stan.12312)
Citations: 3
Abstract
We study a statistical framework for replicability based on a recently proposed quantitative measure of replication success, the sceptical p-value. A recalibration is proposed to obtain exact overall Type-I error control if the effect is null in both studies and additional bounds on the partial and conditional Type-I error rate, which represent the case where only one study has a null effect. The approach avoids the double dichotomization for significance of the two-trials rule and has larger project power to detect existing effects over both studies in combination. It can also be used for power calculations and requires a smaller replication sample size than the two-trials rule for already convincing original studies. We illustrate the performance of the proposed methodology in an application to data from the Experimental Economics Replication Project.
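The abstract does not spell out either decision rule, so the following Python sketch is only an illustration under stated assumptions. It assumes the two-trials rule means that both studies reach one-sided significance at level alpha (0.025 here), which gives an overall Type-I error of alpha squared when the studies are independent and the effect is null in both. It also assumes the nominal (uncalibrated) sceptical p-value is obtained from z_S^2 solving (z_o^2/z_S^2 - 1)(z_r^2/z_S^2 - 1) = c, where z_o and z_r are the original and replication z-values and c is the ratio of the original to the replication variance. The recalibration proposed in the paper is not reproduced, and all function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def two_trials_success(z_o, z_r, alpha=0.025):
    """Assumed two-trials rule: both one-sided p-values must be below alpha."""
    p_o = 1.0 - STD_NORMAL.cdf(z_o)
    p_r = 1.0 - STD_NORMAL.cdf(z_r)
    return p_o < alpha and p_r < alpha


def two_trials_overall_type1(alpha=0.025):
    """Overall Type-I error for independent studies, null effect in both."""
    return alpha ** 2


def sceptical_z_squared(z_o, z_r, c):
    """Solve (z_o^2/x - 1)(z_r^2/x - 1) = c for x = z_S^2 (assumed definition).

    Expanding gives (c - 1) x^2 + (a + b) x - a b = 0 with a = z_o^2,
    b = z_r^2; the root returned lies between 0 and min(a, b).
    """
    a, b = z_o ** 2, z_r ** 2
    if a == 0.0 or b == 0.0:
        return 0.0
    if c == 1.0:
        return a * b / (a + b)  # half the harmonic mean of a and b
    disc = (a + b) ** 2 + 4.0 * (c - 1.0) * a * b
    return (-(a + b) + sqrt(disc)) / (2.0 * (c - 1.0))


def sceptical_p_one_sided(z_o, z_r, c):
    """Nominal one-sided sceptical p-value (assumption: small only when
    both estimates point in the same direction)."""
    z_s = sqrt(sceptical_z_squared(z_o, z_r, c))
    same_sign = (z_o > 0) == (z_r > 0)
    return 1.0 - STD_NORMAL.cdf(z_s) if same_sign else STD_NORMAL.cdf(z_s)


if __name__ == "__main__":
    z_o, z_r, c = 2.5, 2.5, 1.0  # made-up z-values, equal variances
    print("two-trials success:", two_trials_success(z_o, z_r))
    print("two-trials overall Type-I error:", two_trials_overall_type1())
    print("nominal sceptical p:", sceptical_p_one_sided(z_o, z_r, c))
```

The two-trials rule thresholds each study separately, whereas the sceptical p-value summarizes both z-values and the variance ratio in one number. That difference is what the abstract refers to when it says the approach avoids double dichotomization.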
About the journal:
Statistica Neerlandica has been the journal of the Netherlands Society for Statistics and Operations Research since 1946. It covers all areas of statistics, from theoretical to applied, with a special emphasis on mathematical statistics, statistics for the behavioural sciences and biostatistics. This wide scope is reflected by the expertise of the journal’s editors representing these areas. The diverse editorial board is committed to a fast and fair reviewing process, and will judge submissions on quality, correctness, relevance and originality. Statistica Neerlandica encourages transparency and reproducibility, and offers online resources to make data, code, simulation results and other additional materials publicly available.