Beyond Neyman-Pearson: E-values enable hypothesis testing with a data-driven alpha.

IF 9.4 · Zone 1, Multidisciplinary Journals · Q1 MULTIDISCIPLINARY SCIENCES

Proceedings of the National Academy of Sciences of the United States of America Pub Date : 2024-09-20 DOI:10.1073/pnas.2302098121

Peter D Grünwald


Citations: 0

Abstract



Beyond Neyman-Pearson: E-values enable hypothesis testing with a data-driven alpha.

A standard practice in statistical hypothesis testing is to mention the P-value alongside the accept/reject decision. We show the advantages of mentioning an e-value instead. With P-values, it is not clear how to use an extreme observation (e.g. $p \ll \alpha$) for getting better frequentist decisions. With e-values it is straightforward, since they provide Type-I risk control in a generalized Neyman-Pearson setting with the decision task (a general loss function) determined post hoc, after observation of the data, thereby providing a handle on "roving $\alpha$'s." When Type-II risks are taken into consideration, the only admissible decision rules in the post hoc setting turn out to be e-value-based. Similarly, if the loss incurred when specifying a faulty confidence interval is not fixed in advance, standard confidence intervals and distributions may fail, whereas e-confidence sets and e-posteriors still provide valid risk guarantees. Sufficiently powerful e-values have by now been developed for a range of classical testing problems. We discuss the main challenges for wider development and deployment.
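To make the Type-I risk control mentioned in the abstract concrete, here is a minimal sketch in Python. It is not taken from the paper: it assumes a toy test of the null N(0, 1) against the point alternative N(mu1, 1) and uses the likelihood ratio as the e-value. The function names (`lr_e_value`, `simulate_null`) and all parameter values are hypothetical choices for illustration. The simulation checks two standard facts: an e-value has expectation at most 1 under the null, and by Markov's inequality, rejecting when the e-value is at least 1/alpha has Type-I error at most alpha.

```python
import math
import random


def lr_e_value(xs, mu1=0.5):
    """Likelihood ratio of N(mu1, 1) to N(0, 1) for the sample xs.

    Under the null N(0, 1) this ratio has expectation exactly 1,
    so it is a valid e-value (illustrative toy model, not from the paper).
    """
    n = len(xs)
    return math.exp(mu1 * sum(xs) - n * mu1 ** 2 / 2)


def simulate_null(n_trials=20000, n=10, mu1=0.5, seed=0):
    """Draw n_trials datasets of size n from the null and return their e-values."""
    rng = random.Random(seed)
    return [
        lr_e_value([rng.gauss(0.0, 1.0) for _ in range(n)], mu1)
        for _ in range(n_trials)
    ]


e_values = simulate_null()

# Empirical mean of the e-value under the null; theory says it is at most 1.
mean_e = sum(e_values) / len(e_values)

# Empirical Type-I error of the rule "reject when E >= 1/alpha".
rejection_rates = {
    alpha: sum(e >= 1.0 / alpha for e in e_values) / len(e_values)
    for alpha in (0.05, 0.01)
}

print(f"mean e-value under the null: {mean_e:.3f}")
for alpha, rate in rejection_rates.items():
    print(f"alpha={alpha}: empirical rejection rate {rate:.4f}")
```

The rejection rule here is deliberately conservative: the Markov bound holds for every threshold at once through the single expectation constraint, which is the property that lets the reported e-value be reused when the significance level is only settled after seeing the data.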


Source journal

Proceedings of the National Academy of Sciences of the United States of America (Multidisciplinary journals)

CiteScore: 19.00
Self-citation rate: 0.90%
Articles published: 3575
Review time: 2.5 months

About the journal: The Proceedings of the National Academy of Sciences (PNAS), a peer-reviewed journal of the National Academy of Sciences (NAS), serves as an authoritative source for high-impact, original research across the biological, physical, and social sciences. With a global scope, the journal welcomes submissions from researchers worldwide, making it an inclusive platform for advancing scientific knowledge.