Power-Enhanced Two-Sample Mean Tests for High-Dimensional Compositional Data with Application to Microbiome Data Analysis
Danning Li, Lingzhou Xue, Haoyi Yang, Xiufan Yu
arXiv - MATH - Statistics Theory, 2024-05-04
https://doi.org/arxiv-2405.02551
Abstract
Testing differences in mean vectors is a fundamental task in the analysis of
high-dimensional compositional data. Existing methods may suffer from low power
when the underlying signal pattern does not favor the
deployed test. In this work, we develop two-sample power-enhanced mean tests
for high-dimensional compositional data based on the combination of $p$-values,
which integrates strengths from two popular types of tests: the maximum-type
test and the quadratic-type test. We provide rigorous theoretical guarantees on
the proposed tests, showing accurate Type-I error rate control and enhanced
testing power. Our method extends the testing power to a broader
alternative space, yielding robust performance across a wide range of
signal pattern settings. Our theory also contributes to the literature on power
enhancement and Gaussian approximation for high-dimensional hypothesis testing.
We demonstrate the performance of our method on both simulated data and
real-world microbiome data, showing that our proposed approach improves the
testing power substantially compared to existing methods.
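
To make the idea of combining a maximum-type and a quadratic-type test concrete, here is a minimal sketch in plain Python. It is not the authors' procedure, and every name in it is hypothetical. It assumes a centered log-ratio transform for the compositions, per-coordinate squared z-scores as the building block for both statistics, permutation calibration in place of the asymptotic theory the abstract refers to, and Fisher's and the Cauchy combination as two possible ways to merge the resulting $p$-values.

```python
import math
import random

def clr(row):
    """Centered log-ratio transform of one composition (all parts must be > 0)."""
    logs = [math.log(v) for v in row]
    center = sum(logs) / len(logs)
    return [v - center for v in logs]

def _mean_var(col):
    """Sample mean and unbiased sample variance of a list of numbers."""
    m = sum(col) / len(col)
    return m, sum((v - m) ** 2 for v in col) / (len(col) - 1)

def two_stats(x, y):
    """Return (max-type, sum-type) statistics built from squared z-scores."""
    z2 = []
    for j in range(len(x[0])):
        mx, vx = _mean_var([r[j] for r in x])
        my, vy = _mean_var([r[j] for r in y])
        z2.append((mx - my) ** 2 / (vx / len(x) + vy / len(y)))
    return max(z2), sum(z2)

def permutation_pvalues(x, y, n_perm=200, seed=0):
    """Permutation p-values for both statistics (an assumed calibration)."""
    rng = random.Random(seed)
    obs_max, obs_sum = two_stats(x, y)
    pooled = x + y
    hits_max = hits_sum = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel the samples at random
        m, s = two_stats(pooled[:len(x)], pooled[len(x):])
        hits_max += m >= obs_max
        hits_sum += s >= obs_sum
    return (hits_max + 1) / (n_perm + 1), (hits_sum + 1) / (n_perm + 1)

def fisher_combine(p1, p2):
    """Fisher's method for two p-values, which treats them as independent.
    The chi-square(4) upper tail at t is exp(-t/2) * (1 + t/2)."""
    t = -2.0 * (math.log(p1) + math.log(p2))
    return math.exp(-t / 2.0) * (1.0 + t / 2.0)

def cauchy_combine(p1, p2):
    """Equal-weight Cauchy combination of two p-values."""
    t = 0.5 * (math.tan((0.5 - p1) * math.pi) + math.tan((0.5 - p2) * math.pi))
    return 0.5 - math.atan(t) / math.pi

def demo(n=12, p=6, shift=3.0, seed=1):
    """Simulate two groups of compositions; group y has one inflated part."""
    rng = random.Random(seed)
    def sample(factor):
        raw = [math.exp(rng.gauss(0.0, 0.5)) for _ in range(p)]
        raw[0] *= factor
        total = sum(raw)
        return clr([v / total for v in raw])
    x = [sample(1.0) for _ in range(n)]
    y = [sample(shift) for _ in range(n)]
    p_max, p_sum = permutation_pvalues(x, y)
    return fisher_combine(p_max, p_sum), cauchy_combine(p_max, p_sum)
```

Calling `demo()` returns the two combined $p$-values for one simulated data set. Fisher's rule relies on the two component $p$-values being (at least approximately) independent, which this sketch does not verify; the Cauchy rule is included as an alternative that is less sensitive to dependence.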