Large‐scale covariate‐assisted two‐sample inference under dependence
Authors: Pengfei Wang, Wensheng Zhu
Journal: Scandinavian Journal of Statistics, volume 49, pages 1421-1447
Publication date: 2022-08-18
DOI: 10.1111/sjos.12608 (https://doi.org/10.1111/sjos.12608)
Citation count: 0
Abstract
Problems of large‐scale two‐sample inference often arise in the statistical analysis of “high‐throughput” data. Conventional multiple testing procedures usually lose testing efficiency when two‐sample $t$‐tests are conducted directly. To some extent, this is because sparsity information is ignored. Moreover, the two‐sample tests commonly have local correlations, and neglecting the dependence structure may decrease statistical accuracy. It is therefore imperative to develop a procedure that accounts for both the sparsity information and the dependence structure among the tests. We start by introducing a novel dependence model that allows for sparsity information and dependence structure. Based on this dependence model, we propose a covariate‐assisted local index of significance (COALIS) procedure and show that it is valid and optimal. A data‐driven procedure is then developed to mimic the oracle procedure. Both simulations and real data analysis show that the COALIS procedure outperforms its competitors.
About the journal:
The Scandinavian Journal of Statistics is internationally recognised as one of the leading statistical journals in the world. It was founded in 1974 by four Scandinavian statistical societies. Today more than eighty per cent of the manuscripts are submitted from outside Scandinavia.
It is an international journal devoted to reporting significant and innovative original contributions to statistical methodology, both theory and applications.
The journal specializes in statistical modelling showing particular appreciation of the underlying substantive research problems.
The emergence of specialized methods for analysing longitudinal and spatial data is just one example of an area of important methodological development in which the Scandinavian Journal of Statistics has a particular niche.