Bulk Spectra of Truncated Sample Covariance Matrices
Authors: Subhroshekhar Ghosh, Soumendu Sundar Mukherjee, Himasish Talukdar
arXiv - STAT - Statistics Theory, 2024-09-04, arxiv-2409.02911
Determinantal Point Processes (DPPs), which originate from quantum and
statistical physics, are known for modelling diversity. Recent research [Ghosh
and Rigollet (2020)] has demonstrated that certain matrix-valued $U$-statistics
(that are truncated versions of the usual sample covariance matrix) can
effectively estimate parameters in the context of Gaussian DPPs and enhance
dimension reduction techniques, outperforming standard methods like PCA in
clustering applications. This paper explores the spectral properties of these
matrix-valued $U$-statistics in the null setting of an isotropic design. These
matrices may be represented as $X L X^\top$, where $X$ is a data matrix and $L$
is the Laplacian matrix of a random geometric graph associated to $X$. The main
mathematically interesting twist here is that the matrix $L$ is dependent on
$X$. We give complete descriptions of the bulk spectra of these matrix-valued
$U$-statistics in terms of the Stieltjes transforms of their empirical spectral
measures. The results and the techniques are in fact able to address a broader
class of kernelised random matrices, connecting their limiting spectra to
generalised Mar\v{c}enko-Pastur laws and free probability.
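
The representation $X L X^\top$ can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes the truncated statistic is the unnormalised sum of $(x_i - x_j)(x_i - x_j)^\top$ over pairs $i < j$ with $\|x_i - x_j\| \le r$, that $X$ is $p \times n$ with one sample per column, and that the Stieltjes transform uses the convention $s(z) = \frac{1}{p}\sum_k (\lambda_k - z)^{-1}$. The threshold `r`, the sizes, the normalisation, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def geometric_graph_laplacian(X, r):
    """Return (L, A): Laplacian L = D - A of the graph joining i, j when ||x_i - x_j|| <= r.

    X is p x n with one sample per column (an assumed layout).
    """
    diff = X[:, :, None] - X[:, None, :]        # p x n x n array of x_i - x_j
    sq_dist = (diff ** 2).sum(axis=0)           # n x n squared distances
    A = (sq_dist <= r ** 2).astype(float)       # adjacency of the geometric graph
    np.fill_diagonal(A, 0.0)                    # no self-loops
    return np.diag(A.sum(axis=1)) - A, A

def pairwise_sum(X, A):
    """Direct sum of (x_i - x_j)(x_i - x_j)^T over adjacent pairs i < j."""
    p, n = X.shape
    S = np.zeros((p, p))
    for i in range(n):
        for j in range(i + 1, n):
            if A[i, j] > 0:
                d = X[:, i] - X[:, j]
                S += np.outer(d, d)
    return S

def empirical_stieltjes(eigs, z):
    """s(z) = (1/p) * sum_k 1 / (lambda_k - z), for z off the real axis."""
    return np.mean(1.0 / (eigs - z))

rng = np.random.default_rng(0)
p, n, r = 3, 40, 2.0                            # illustrative sizes and threshold
X = rng.standard_normal((p, n))                 # isotropic Gaussian design
L, A = geometric_graph_laplacian(X, r)
M = X @ L @ X.T                                 # unnormalised X L X^T
eigs = np.linalg.eigvalsh(M)                    # spectrum of the symmetric matrix M
s = empirical_stieltjes(eigs, 1.0 + 1.0j)       # transform at one point in the upper half-plane
```

Here `M` agrees with `pairwise_sum(X, A)` up to rounding, which is the identity behind the $X L X^\top$ form. Because `A` is computed from `X`, the Laplacian `L` is not independent of the data, which is the dependence the abstract highlights.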