Knockoff procedure for false discovery rate control in high-dimensional data streams.

IF 1.2 4区 数学 Q2 STATISTICS & PROBABILITY
Journal of Applied Statistics Pub Date : 2023-05-15 eCollection Date: 2023-01-01 DOI:10.1080/02664763.2023.2200496
Ka Wai Tsang, Fugee Tsung, Zhihao Xu
{"title":"Knockoff procedure for false discovery rate control in high-dimensional data streams.","authors":"Ka Wai Tsang, Fugee Tsung, Zhihao Xu","doi":"10.1080/02664763.2023.2200496","DOIUrl":null,"url":null,"abstract":"<p><p>Motivated by applications to root-cause identification of faults in high-dimensional data streams that may have very limited samples after faults are detected, we consider multiple testing in models for multivariate statistical process control (SPC). With quick fault detection, only small portion of data streams being out-of-control (OC) can be assumed. It is a long standing problem to identify those OC data streams while controlling the number of false discoveries. It is challenging due to the limited number of OC samples after the termination of the process when faults are detected. Although several false discovery rate (FDR) controlling methods have been proposed, people may prefer other methods for quick detection. With a recently developed method called Knockoff filtering, we propose a knockoff procedure that can combine with other fault detection methods in the sense that the knockoff procedure does not change the stopping time, but may identify another set of faults to control FDR. A theorem for the FDR control of the proposed procedure is provided. Simulation studies show that the proposed procedure can control FDR while maintaining high power. We also illustrate the performance in an application to semiconductor manufacturing processes that motivated this development.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"50 14","pages":"2970-2983"},"PeriodicalIF":1.2000,"publicationDate":"2023-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10557548/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Applied Statistics","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1080/02664763.2023.2200496","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
引用次数: 0

Abstract

Motivated by applications to root-cause identification of faults in high-dimensional data streams that may have very limited samples after faults are detected, we consider multiple testing in models for multivariate statistical process control (SPC). With quick fault detection, only small portion of data streams being out-of-control (OC) can be assumed. It is a long standing problem to identify those OC data streams while controlling the number of false discoveries. It is challenging due to the limited number of OC samples after the termination of the process when faults are detected. Although several false discovery rate (FDR) controlling methods have been proposed, people may prefer other methods for quick detection. With a recently developed method called Knockoff filtering, we propose a knockoff procedure that can combine with other fault detection methods in the sense that the knockoff procedure does not change the stopping time, but may identify another set of faults to control FDR. A theorem for the FDR control of the proposed procedure is provided. Simulation studies show that the proposed procedure can control FDR while maintaining high power. We also illustrate the performance in an application to semiconductor manufacturing processes that motivated this development.

用于高维数据流中错误发现率控制的Knockoff过程。
由于应用于高维数据流中故障的根本原因识别,在检测到故障后样本可能非常有限,我们考虑在多变量统计过程控制(SPC)的模型中进行多重测试。在快速故障检测的情况下,可以假设只有一小部分数据流失控(OC)。识别这些OC数据流同时控制错误发现的数量是一个长期存在的问题。这是具有挑战性的,因为当检测到故障时,过程终止后OC样本的数量有限。尽管已经提出了几种错误发现率(FDR)控制方法,但人们可能更喜欢其他快速检测方法。通过最近开发的一种称为Knockoff滤波的方法,我们提出了一种可以与其他故障检测方法相结合的Knockoff过程,因为Knockoff程序不会改变停止时间,但可以识别另一组故障来控制FDR。给出了所提出程序的FDR控制的一个定理。仿真研究表明,该方法可以在保持高功率的同时控制FDR。我们还说明了推动这一发展的半导体制造工艺应用的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Journal of Applied Statistics
Journal of Applied Statistics 数学-统计学与概率论
CiteScore
3.40
自引率
0.00%
发文量
126
审稿时长
6 months
期刊介绍: Journal of Applied Statistics provides a forum for communication between both applied statisticians and users of applied statistical techniques across a wide range of disciplines. These areas include business, computing, economics, ecology, education, management, medicine, operational research and sociology, but papers from other areas are also considered. The editorial policy is to publish rigorous but clear and accessible papers on applied techniques. Purely theoretical papers are avoided but those on theoretical developments which clearly demonstrate significant applied potential are welcomed. Each paper is submitted to at least two independent referees.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信