独立性下τ -审查加权benjami - hochberg程序

IF 2.4 2区 数学 Q2 BIOLOGY
Biometrika Pub Date : 2023-08-02 DOI:10.1093/biomet/asad047
Haibing Zhao, Huijuan Zhou
{"title":"独立性下τ -审查加权benjami - hochberg程序","authors":"Haibing Zhao, Huijuan Zhou","doi":"10.1093/biomet/asad047","DOIUrl":null,"url":null,"abstract":"\n In the field of multiple hypothesis testing, auxiliary information can be leveraged to enhance the efficiency of test procedures. A common way to make use of auxiliary information is by weighting p-values. However, when the weights are learned from data, controlling the finite-sample false discovery rate becomes challenging, and most existing weighted procedures only guarantee false discovery rate control in an asymptotic limit. In a recent study conducted by Ignatiadis & Huber (2021), a novel τ-censored weighted Benjamini-Hochberg procedure was proposed to control the finite-sample false discovery rate. The authors employed the cross-weighting approach to learn weights for the p-values. This approach randomly splits the data into several folds and constructs a weight for each p-value Pi using the p-values outside the fold containing Pi. Cross-weighting does not exploit the p-value information inside the fold and only balances the weights within each fold, which may result in a loss of power. In this article, we introduce two methods for constructing data-driven weights for τ-censored weighted Benjamini-Hochberg procedures under independence. They provide new insight into masking p-values to prevent overfitting in multiple testing. The first method utilizes a leave-one-out technique, where all but one of the p-values are used to learn a weight for each p-value. This technique masks the information of a p-value in its weight by calculating the infimum of the weight with respect to the p-value. The second method uses partial information from each p-value to construct weights and utilizes the conditional distributions of the null p-values to establish false discovery rate control. Additionally, we propose two methods for estimating the null proportion and demonstrate how to integrate null-proportion adaptivity into the proposed weights to improve power.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.4000,"publicationDate":"2023-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"τ -censored weighted Benjamini-Hochberg procedures under independence\",\"authors\":\"Haibing Zhao, Huijuan Zhou\",\"doi\":\"10.1093/biomet/asad047\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n In the field of multiple hypothesis testing, auxiliary information can be leveraged to enhance the efficiency of test procedures. A common way to make use of auxiliary information is by weighting p-values. However, when the weights are learned from data, controlling the finite-sample false discovery rate becomes challenging, and most existing weighted procedures only guarantee false discovery rate control in an asymptotic limit. In a recent study conducted by Ignatiadis & Huber (2021), a novel τ-censored weighted Benjamini-Hochberg procedure was proposed to control the finite-sample false discovery rate. The authors employed the cross-weighting approach to learn weights for the p-values. This approach randomly splits the data into several folds and constructs a weight for each p-value Pi using the p-values outside the fold containing Pi. Cross-weighting does not exploit the p-value information inside the fold and only balances the weights within each fold, which may result in a loss of power. In this article, we introduce two methods for constructing data-driven weights for τ-censored weighted Benjamini-Hochberg procedures under independence. They provide new insight into masking p-values to prevent overfitting in multiple testing. The first method utilizes a leave-one-out technique, where all but one of the p-values are used to learn a weight for each p-value. This technique masks the information of a p-value in its weight by calculating the infimum of the weight with respect to the p-value. The second method uses partial information from each p-value to construct weights and utilizes the conditional distributions of the null p-values to establish false discovery rate control. Additionally, we propose two methods for estimating the null proportion and demonstrate how to integrate null-proportion adaptivity into the proposed weights to improve power.\",\"PeriodicalId\":9001,\"journal\":{\"name\":\"Biometrika\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":2.4000,\"publicationDate\":\"2023-08-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Biometrika\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.1093/biomet/asad047\",\"RegionNum\":2,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biometrika","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1093/biomet/asad047","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOLOGY","Score":null,"Total":0}
引用次数: 1

摘要

在多重假设检验领域,可以利用辅助信息来提高检验程序的效率。利用辅助信息的一种常见方式是对p值进行加权。然而,当从数据中学习权重时,控制有限样本的错误发现率变得具有挑战性,并且大多数现有的加权过程仅保证错误发现率控制在渐近极限中。在Ignatidis&Huber(2021)最近进行的一项研究中,提出了一种新的τ-截尾加权Benjamini Hochberg程序来控制有限样本的错误发现率。作者采用交叉加权方法来学习p值的权重。这种方法将数据随机划分为几个折叠,并使用包含Pi的折叠之外的p值为每个p值Pi构建权重。交叉加权不利用折叠内的p值信息,只平衡每个折叠内的权重,这可能导致功率损失。在本文中,我们介绍了两种在独立条件下构造τ-截尾加权Benjamini-Hochberg过程数据驱动权重的方法。它们为屏蔽p值提供了新的见解,以防止多重测试中的过拟合。第一种方法使用留一技术,其中除了一个p值之外的所有p值都用于学习每个p值的权重。该技术通过计算权重相对于p值的下确界来屏蔽其权重中的p值的信息。第二种方法使用来自每个p值的部分信息来构造权重,并利用空p值的条件分布来建立错误发现率控制。此外,我们提出了两种估计零比例的方法,并演示了如何将零比例自适应性集成到所提出的权重中以提高功率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
τ -censored weighted Benjamini-Hochberg procedures under independence
In the field of multiple hypothesis testing, auxiliary information can be leveraged to enhance the efficiency of test procedures. A common way to make use of auxiliary information is by weighting p-values. However, when the weights are learned from data, controlling the finite-sample false discovery rate becomes challenging, and most existing weighted procedures only guarantee false discovery rate control in an asymptotic limit. In a recent study conducted by Ignatiadis & Huber (2021), a novel τ-censored weighted Benjamini-Hochberg procedure was proposed to control the finite-sample false discovery rate. The authors employed the cross-weighting approach to learn weights for the p-values. This approach randomly splits the data into several folds and constructs a weight for each p-value Pi using the p-values outside the fold containing Pi. Cross-weighting does not exploit the p-value information inside the fold and only balances the weights within each fold, which may result in a loss of power. In this article, we introduce two methods for constructing data-driven weights for τ-censored weighted Benjamini-Hochberg procedures under independence. They provide new insight into masking p-values to prevent overfitting in multiple testing. The first method utilizes a leave-one-out technique, where all but one of the p-values are used to learn a weight for each p-value. This technique masks the information of a p-value in its weight by calculating the infimum of the weight with respect to the p-value. The second method uses partial information from each p-value to construct weights and utilizes the conditional distributions of the null p-values to establish false discovery rate control. Additionally, we propose two methods for estimating the null proportion and demonstrate how to integrate null-proportion adaptivity into the proposed weights to improve power.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Biometrika
Biometrika 生物-生物学
CiteScore
5.50
自引率
3.70%
发文量
56
审稿时长
6-12 weeks
期刊介绍: Biometrika is primarily a journal of statistics in which emphasis is placed on papers containing original theoretical contributions of direct or potential value in applications. From time to time, papers in bordering fields are also published.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信