Controlling the false discovery rate in transformational sparsity: Split Knockoffs

Impact factor 3.1 · Zone 1 (Mathematics) · JCR Q1 (Statistics & Probability)
Yang Cao, Xinwei Sun, Yuan Yao
Journal: Journal of the Royal Statistical Society Series B-Statistical Methodology
Publication date: 2023-11-14
Publication type: Journal Article
DOI: 10.1093/jrsssb/qkad126 (https://doi.org/10.1093/jrsssb/qkad126)
Citations: 2

Abstract

Controlling the False Discovery Rate (FDR) in a variable selection procedure is critical for reproducible discoveries, and it has been extensively studied in sparse linear models. However, it remains largely open in scenarios where the sparsity constraint is not directly imposed on the parameters but on a linear transformation of the parameters to be estimated. Examples of such scenarios include total variations, wavelet transforms, fused LASSO, and trend filtering. In this paper, we propose a data-adaptive FDR control method, called the Split Knockoff method, for this transformational sparsity setting. The proposed method exploits both variable and data splitting. The linear transformation constraint is relaxed to its Euclidean proximity in a lifted parameter space, which yields an orthogonal design that enables the orthogonal Split Knockoff construction. To overcome the challenge that exchangeability fails due to the heterogeneous noise brought by the transformation, new inverse supermartingale structures are developed via data splitting for provable FDR control without sacrificing power. Simulation experiments demonstrate that the proposed methodology achieves the desired FDR and power. We also provide an application to an Alzheimer’s Disease study, where atrophied brain regions and their abnormal connections can be discovered based on a structural Magnetic Resonance Imaging dataset.
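As a minimal sketch of what "transformational sparsity" means, the example below uses a first-difference matrix `D` (the total variation / fused LASSO case named in the abstract): a piecewise-constant parameter vector `beta` is dense, while `D @ beta` is sparse. The function `relaxed_objective` illustrates one plausible form of relaxing the constraint `gamma = D @ beta` to a Euclidean proximity penalty. Its exact form, and the names `first_difference`, `relaxed_objective`, `nu`, and `lam`, are assumptions made for illustration and are not taken from the abstract.

```python
import numpy as np

def first_difference(p):
    """Return the (p-1) x p matrix D with (D @ beta)[i] = beta[i+1] - beta[i]."""
    D = np.zeros((p - 1, p))
    idx = np.arange(p - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

def relaxed_objective(y, X, D, beta, gamma, nu, lam):
    """Assumed relaxed loss (illustrative only):
    ||y - X beta||^2 / (2n) + ||D beta - gamma||^2 / (2 nu) + lam * ||gamma||_1.
    """
    n = len(y)
    fit = np.sum((y - X @ beta) ** 2) / (2 * n)
    proximity = np.sum((D @ beta - gamma) ** 2) / (2 * nu)
    return fit + proximity + lam * np.sum(np.abs(gamma))

# Piecewise-constant signal: every entry of beta is nonzero,
# but the transformed vector gamma = D @ beta has only two nonzeros.
beta = np.array([2.0, 2.0, 2.0, 5.0, 5.0, 1.0, 1.0, 1.0])
D = first_difference(len(beta))
gamma = D @ beta
support = np.flatnonzero(gamma)  # indices where the signal jumps

# Noiseless toy data, so the fit and proximity terms vanish at the truth
# and only the l1 penalty on gamma remains.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, len(beta)))
y = X @ beta
value = relaxed_objective(y, X, D, beta, gamma, nu=1.0, lam=0.1)
```

Here variable selection targets the support of `gamma` (the jump locations) rather than the support of `beta`, which is the setting where the abstract says FDR control remains largely open.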
Source journal: CiteScore 8.80; self-citation rate 0.00%; articles published 83; review time >12 weeks
Journal description: Series B (Statistical Methodology) aims to publish high quality papers on the methodological aspects of statistics and data science more broadly. The objective of papers should be to contribute to the understanding of statistical methodology and/or to develop and improve statistical methods; any mathematical theory should be directed towards these aims. The kinds of contribution considered include descriptions of new methods of collecting or analysing data, with the underlying theory, an indication of the scope of application and preferably a real example. Also considered are comparisons, critical evaluations and new applications of existing methods, contributions to probability theory which have a clear practical bearing (including the formulation and analysis of stochastic models), statistical computation or simulation where original methodology is involved, and original contributions to the foundations of statistical science. Reviews of methodological techniques are also considered. A paper, even if correct and well presented, is likely to be rejected if it only presents straightforward special cases of previously published work, if it is of mathematical interest only, if it is too long in relation to the importance of the new material that it contains, or if it is dominated by computations or simulations of a routine nature.