On spiked eigenvalues of a renormalized sample covariance matrix from multi-population

Weiming Li, Zeng Li, Junpeng Zhu
arXiv - STAT - Statistics Theory. Published 2024-09-13. DOI: arxiv-2409.08715 (https://doi.org/arxiv-2409.08715)

Abstract

Sample covariance matrices from multi-population typically exhibit several large spiked eigenvalues, which stem from differences between population means and are crucial for inference on the underlying data structure. This paper investigates the asymptotic properties of the spiked eigenvalues of a renormalized sample covariance matrix from multi-population in the ultrahigh dimensional context, where the dimension-to-sample-size ratio p/n goes to infinity. The first- and second-order convergence of these spikes is established based on the asymptotic properties of three types of sesquilinear forms from multi-population. These findings are further applied to two scenarios: determining the total number of subgroups, and a new criterion for evaluating clustering results in the absence of true labels. Additionally, we provide a unified framework with $p/n \to c \in (0, \infty]$ that integrates the asymptotic results in both high and ultrahigh dimensional settings.
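To make the link between mean differences and spiked eigenvalues concrete, the following is a minimal simulation sketch, not the paper's procedure. It assumes the renormalization takes the form (X_c X_c^T - p I_n) / sqrt(np) for the centered n-by-p data matrix X_c, a form common in the ultrahigh-dimensional literature; the paper's exact definition may differ. The function names, the unit-variance Gaussian noise, the group-mean scale `mean_sd`, and the cutoff `threshold=3.0` are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_groups(n_per_group, p, k, mean_sd, seed=0):
    """Draw k equally sized groups in dimension p with distinct random means.

    Assumption: unit-variance Gaussian noise and group means drawn i.i.d.
    from N(0, mean_sd^2); this is an illustrative model only.
    """
    rng = np.random.default_rng(seed)
    means = rng.normal(0.0, mean_sd, size=(k, p))
    labels = np.repeat(np.arange(k), n_per_group)
    noise = rng.standard_normal((k * n_per_group, p))
    return means[labels] + noise, labels


def renormalized_eigenvalues(X):
    """Eigenvalues (descending) of the assumed renormalized matrix.

    Assumed form: (Xc Xc^T - p I_n) / sqrt(n p), with Xc the column-centered
    data. Working with the n-by-n Gram matrix keeps the cost low when p >> n.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0, keepdims=True)
    gram = Xc @ Xc.T
    A = (gram - p * np.eye(n)) / np.sqrt(n * p)
    return np.linalg.eigvalsh(A)[::-1]


def estimate_num_groups(X, threshold=3.0):
    """Count eigenvalues above an ad hoc cutoff and add one.

    After centering, k distinct means span at most k - 1 directions, so the
    number of large eigenvalues is read as k - 1. The cutoff sits above the
    semicircle bulk edge near 2 and is a hypothetical choice, not a
    calibrated test.
    """
    eigs = renormalized_eigenvalues(X)
    return int(np.sum(eigs > threshold)) + 1


# Three groups, n = 150 samples, p = 15000 dimensions (p/n = 100).
X, labels = simulate_groups(n_per_group=50, p=15000, k=3, mean_sd=0.14)
eigs = renormalized_eigenvalues(X)
print("largest eigenvalues:", np.round(eigs[:4], 2))
print("estimated number of groups:", estimate_num_groups(X))
```

In this setup the two leading eigenvalues separate clearly from the bulk, which stays near the interval [-2, 2] apart from one strongly negative eigenvalue introduced by centering. With real data the noise scale is unknown, so the identity shift and the cutoff would both need to be replaced by estimated quantities.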