Power-Enhanced Two-Sample Mean Tests for High-Dimensional Compositional Data with Application to Microbiome Data Analysis
Danning Li, Lingzhou Xue, Haoyi Yang, Xiufan Yu
arXiv - MATH - Statistics Theory, 2024-05-04, arxiv-2405.02551
Testing differences in mean vectors is a fundamental task in the analysis of
high-dimensional compositional data. Existing methods may suffer from low power
when the underlying signal pattern does not favor the deployed test.
In this work, we develop two-sample power-enhanced mean tests
for high-dimensional compositional data based on a combination of $p$-values
that integrates the strengths of two popular types of tests: the maximum-type
test and the quadratic-type test. We provide rigorous theoretical guarantees on
the proposed tests, showing accurate Type-I error rate control and enhanced
testing power. Our method extends the testing power to a broader
alternative space, yielding robust performance across a wide range of
signal patterns. Our theory also contributes to the literature on power
enhancement and Gaussian approximation for high-dimensional hypothesis testing.
We demonstrate the performance of our method on both simulated data and
real-world microbiome data, showing that our proposed approach improves the
testing power substantially compared to existing methods.
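
The idea of combining a maximum-type and a quadratic-type test through their $p$-values can be illustrated with a minimal sketch. It is not the authors' procedure, and several pieces are assumptions made for illustration only: the abstract does not name the combination rule, so the Cauchy combination is used here; the centered log-ratio transform is assumed as the way to handle compositional data; and permutation $p$-values stand in for the asymptotic (Gaussian-approximation) calibration the paper develops. All function names (`clr`, `two_sample_stats`, `cauchy_combine`, `combined_test`) are hypothetical.

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform of strictly positive compositions (one sample per row)."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def two_sample_stats(a, b):
    """Return (max-type, quadratic-type) statistics built from squared per-coordinate z-scores."""
    diff = a.mean(axis=0) - b.mean(axis=0)
    se2 = a.var(axis=0, ddof=1) / a.shape[0] + b.var(axis=0, ddof=1) / b.shape[0]
    z2 = diff ** 2 / se2
    return z2.max(), z2.sum()

def cauchy_combine(pvals):
    """Cauchy combination of p-values (an assumed rule, not taken from the abstract)."""
    p = np.clip(np.asarray(pvals, dtype=float), 1e-15, 1 - 1e-15)
    t = np.mean(np.tan((0.5 - p) * np.pi))
    return 0.5 - np.arctan(t) / np.pi

def combined_test(x, y, n_perm=500, seed=0):
    """Permutation p-values for both statistics, then one combined p-value."""
    rng = np.random.default_rng(seed)
    a, b = clr(x), clr(y)
    pooled = np.vstack([a, b])
    n1 = a.shape[0]
    obs = np.array(two_sample_stats(a, b))
    exceed = np.zeros(2)
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        perm = np.array(two_sample_stats(pooled[idx[:n1]], pooled[idx[n1:]]))
        exceed += perm >= obs
    # The +1 correction keeps permutation p-values strictly positive.
    p_max, p_quad = (exceed + 1) / (n_perm + 1)
    return p_max, p_quad, cauchy_combine([p_max, p_quad])

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    alpha = np.full(20, 5.0)
    shifted = alpha.copy()
    shifted[0] = 50.0  # a single strongly shifted component (sparse signal)
    x = rng.dirichlet(alpha, size=30)
    y = rng.dirichlet(shifted, size=30)
    print(combined_test(x, y, n_perm=199))
```

The maximum-type statistic reacts to a few large coordinate differences, while the quadratic-type statistic accumulates many small ones. Combining the two $p$-values is meant to retain reasonable power under either pattern.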