On spiked eigenvalues of a renormalized sample covariance matrix from multi-population
Weiming Li, Zeng Li, Junpeng Zhu
arXiv - STAT - Statistics Theory, 2024-09-13
https://doi.org/arxiv-2409.08715
Citations: 0
Abstract
Sample covariance matrices from multi-population typically exhibit several
large spiked eigenvalues, which stem from differences between population means
and are crucial for inference on the underlying data structure. This paper
investigates the asymptotic properties of spiked eigenvalues of a renormalized
sample covariance matrix from multi-population in the ultrahigh dimensional
context where the dimension-to-sample size ratio p/n goes to infinity. The first-
and second-order convergence of these spikes is established based on
asymptotic properties of three types of sesquilinear forms from
multi-population. These findings are further applied to two scenarios:
determination of the total number of subgroups and a new criterion for evaluating
clustering results in the absence of true labels. Additionally, we provide a
unified framework with $p/n \to c \in (0,\infty]$ that integrates the asymptotic
results in both high and ultrahigh dimensional settings.
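
The spiking phenomenon described above can be illustrated with a small simulation. The sketch below is not the authors' procedure. It assumes the renormalization (X^T X - p I_n) / sqrt(n p), which is common in the p/n -> infinity literature but is not stated in the abstract, and it uses Gaussian noise, group means with disjoint supports, and an ad hoc cutoff of 5 to separate spikes from the bulk. The function names `simulate_groups` and `renormalized_gram_eigenvalues` are hypothetical.

```python
import numpy as np


def renormalized_gram_eigenvalues(X):
    """Eigenvalues (descending) of (X^T X - p I_n) / sqrt(n p) for a p x n matrix X.

    Assumed renormalization; the abstract does not give the exact definition.
    """
    p, n = X.shape
    A = (X.T @ X - p * np.eye(n)) / np.sqrt(n * p)
    return np.linalg.eigvalsh(A)[::-1]


def simulate_groups(p, group_sizes, mean_norm_sq, rng):
    """Build a p x n matrix whose columns are group mean + standard Gaussian noise.

    Each group's mean is supported on its own block of coordinates, so the
    means are mutually orthogonal with squared norm `mean_norm_sq`.
    """
    K = len(group_sizes)
    block = p // K
    columns = []
    for k, n_k in enumerate(group_sizes):
        mu = np.zeros(p)
        mu[k * block:(k + 1) * block] = np.sqrt(mean_norm_sq / block)
        columns.append(mu[:, None] + rng.standard_normal((p, n_k)))
    return np.hstack(columns)


rng = np.random.default_rng(0)
# p / n = 100, a stand-in for the ultrahigh dimensional regime.
X = simulate_groups(p=6000, group_sizes=[20, 20, 20], mean_norm_sq=600.0, rng=rng)
eigs = renormalized_gram_eigenvalues(X)

# Ad hoc cutoff (an assumption): pure-noise eigenvalues stay close to [-2, 2].
n_spikes = int(np.sum(eigs > 5.0))
print("largest eigenvalues:", np.round(eigs[:5], 2))
print("eigenvalues above cutoff:", n_spikes)
```

With three groups whose means differ, three eigenvalues separate clearly from the rest, which is the kind of signal the paper uses to determine the number of subgroups. The fixed cutoff here is only for illustration; the paper derives its criteria from the first- and second-order limits of the spikes.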