Two-sample Behrens–Fisher problems for high-dimensional data: a normal reference F-type test
Authors: Tianming Zhu, Pengfei Wang, Jin-Ting Zhang
Journal: Computational Statistics (impact factor 1.0; JCR Q3, Statistics & Probability)
Published: 2023-11-24
DOI: 10.1007/s00180-023-01433-6 (https://doi.org/10.1007/s00180-023-01433-6)
Citations: 0
Abstract
The problem of testing the equality of mean vectors for high-dimensional data has been intensively investigated in the literature. However, most existing tests impose strong assumptions on the underlying group covariance matrices, which may not be satisfied or can hardly be checked in practice. In this article, an F-type test for two-sample Behrens–Fisher problems for high-dimensional data is proposed and studied. When the two samples are normally distributed and the null hypothesis is valid, the proposed F-type test statistic is shown to be an F-type mixture, that is, a ratio of two independent \(\chi^2\)-type mixtures. Under some regularity conditions and the null hypothesis, it is shown that the proposed F-type test statistic and this F-type mixture have the same normal and non-normal limits. It is therefore justified to approximate the null distribution of the proposed F-type test statistic by that of the F-type mixture, resulting in the so-called normal reference F-type test. Since the F-type mixture is a ratio of two independent \(\chi^2\)-type mixtures, the Welch–Satterthwaite \(\chi^2\)-approximation is applied separately to the distributions of its numerator and denominator, resulting in an approximating F-distribution whose degrees of freedom can be consistently estimated from the data. The asymptotic power of the proposed F-type test is established. Two simulation studies show that, in terms of size control, the proposed F-type test outperforms two existing competitors. The good performance of the proposed F-type test is also illustrated by a COVID-19 data example.
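The Welch–Satterthwaite step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the mixture coefficients are known, whereas the paper estimates the degrees of freedom from the data, and the function names `welch_satterthwaite` and `f_mixture_dfs` are hypothetical. A \(\chi^2\)-type mixture \(T = \sum_r c_r A_r\) with independent \(A_r \sim \chi^2_{k_r}\) is approximated by \(\beta \chi^2_d\), where \(\beta\) and \(d\) are chosen to match the mean and variance of \(T\). Applying this to the numerator and denominator separately suggests that the ratio of the two mixtures, each divided by its mean, is approximately \(F(d_1, d_2)\).

```python
from typing import Sequence, Tuple


def welch_satterthwaite(coefs: Sequence[float],
                        dofs: Sequence[float]) -> Tuple[float, float]:
    """Approximate T = sum_r c_r * A_r, A_r ~ chi2(k_r) independent,
    by beta * chi2(d), matching the mean and variance of T.

    Assumption of this sketch: the coefficients c_r are known.
    """
    mean = sum(c * k for c, k in zip(coefs, dofs))          # E[T]
    half_var = sum(c * c * k for c, k in zip(coefs, dofs))  # Var[T] / 2
    if mean <= 0 or half_var <= 0:
        raise ValueError("mixture must have positive mean and variance")
    beta = half_var / mean       # scale of the approximating chi-square
    d = mean * mean / half_var   # its approximate degrees of freedom
    return beta, d


def f_mixture_dfs(num_coefs: Sequence[float], num_dofs: Sequence[float],
                  den_coefs: Sequence[float], den_dofs: Sequence[float]
                  ) -> Tuple[float, float]:
    """Approximate degrees of freedom (d1, d2) for a ratio of two
    independent chi-square-type mixtures, each scaled by its mean."""
    _, d1 = welch_satterthwaite(num_coefs, num_dofs)
    _, d2 = welch_satterthwaite(den_coefs, den_dofs)
    return d1, d2


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only for illustration.
    d1, d2 = f_mixture_dfs([3.0, 1.0, 0.5], [1, 1, 1],
                           [0.10, 0.05], [19, 29])
    print(f"approximate F degrees of freedom: d1={d1:.3f}, d2={d2:.3f}")
```

When all coefficients are equal, the approximation is exact: the mixture is a scaled \(\chi^2\) variable and \(d\) equals the total degrees of freedom. The more the coefficients differ, the smaller \(d\) becomes relative to that total.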
About the journal:
Computational Statistics (CompStat) is an international journal that promotes the publication of applications and methodological research in the field of computational statistics. Papers in CompStat focus on the contribution and influence of computing on statistics and vice versa. The journal provides a forum for computer scientists, mathematicians, and statisticians working in a variety of areas of statistics, such as biometrics, econometrics, data analysis, graphics, simulation, algorithms, knowledge-based systems, and Bayesian computing. CompStat also publishes hardware, software, and package reports.