Feasible model-based principal component analysis: Joint estimation of rank and error covariance matrix
Authors: Tak-Shing T. Chan, Alex Gibberd
Journal: Computational Statistics & Data Analysis, volume 201, Article 108042
Published: 2024-08-22
DOI: 10.1016/j.csda.2024.108042
URL: https://www.sciencedirect.com/science/article/pii/S0167947324001269
Real-world inputs to principal component analysis are often corrupted by temporally or spatially correlated errors. Several methods mitigate this, e.g., generalized least-squares matrix decomposition and maximum likelihood approaches; however, they all require the number of components or the error covariances to be known in advance, which renders them infeasible in practice. To address this issue, a novel method is developed that estimates the number of components and the error covariances at the same time. The method is based on working covariance models, an idea adapted from generalized estimating equations, in which the user specifies only the structural form of the error covariances. If the structural form is also unknown, working covariance selection can be used to search for the best structure in a user-defined library. Experiments on synthetic and real data confirm the efficacy of the proposed approach.
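As a rough illustration of the setting described in the abstract, the sketch below shows the known-covariance case that existing methods presuppose: a rank-k fit under a generalized least-squares loss with a fixed error covariance. It is not the authors' algorithm and does not estimate the rank or the covariance. The AR(1) structure, the function names (`ar1_covariance`, `gls_low_rank`, `weighted_loss`), and the parameters `rho` and `k` are assumptions chosen for illustration only.

```python
import numpy as np


def ar1_covariance(n, rho):
    """Assumed AR(1) working covariance: Sigma[i, j] = rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def gls_low_rank(X, k, rho):
    """Rank-k fit L minimizing trace((X - L)^T Sigma^{-1} (X - L)).

    Sigma is the row (e.g. temporal) error covariance; both k and rho
    are treated as known here, which is exactly the requirement the
    abstract identifies as the practical obstacle.
    """
    n = X.shape[0]
    C = np.linalg.cholesky(ar1_covariance(n, rho))  # Sigma = C @ C.T
    Y = np.linalg.solve(C, X)                       # whiten the rows
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    Y_k = (U[:, :k] * s[:k]) @ Vt[:k]               # best rank-k fit of Y
    return C @ Y_k                                  # map back to data scale


def weighted_loss(X, L, rho):
    """Generalized least-squares loss of a fit L under the AR(1) covariance."""
    C = np.linalg.cholesky(ar1_covariance(X.shape[0], rho))
    R = np.linalg.solve(C, X - L)
    return float(np.sum(R ** 2))
```

Calling `gls_low_rank(X, k=2, rho=0.6)` on an n-by-p data matrix returns a matrix of the same shape and rank at most 2; with `rho=0.0` the covariance is the identity and the result reduces to the ordinary truncated SVD used in standard PCA.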
About the journal:
Computational Statistics and Data Analysis (CSDA), an Official Publication of the network Computational and Methodological Statistics (CMStatistics) and of the International Association for Statistical Computing (IASC), is an international journal dedicated to the dissemination of methodological research and applications in the areas of computational statistics and data analysis. The journal consists of four refereed sections which are divided into the following subject areas:
I) Computational Statistics - Manuscripts dealing with: 1) the explicit impact of computers on statistical methodology (e.g., Bayesian computing, bioinformatics, computer graphics, computer-intensive inferential methods, data exploration, data mining, expert systems, heuristics, knowledge-based systems, machine learning, neural networks, numerical and optimization methods, parallel computing, statistical databases, statistical systems), and 2) the development, evaluation and validation of statistical software and algorithms. Software and algorithms can be submitted with manuscripts and will be stored together with the online article.
II) Statistical Methodology for Data Analysis - Manuscripts dealing with novel and original data analytical strategies and methodologies applied in biostatistics (design and analytic methods for clinical trials, epidemiological studies, statistical genetics, or genetic/environmental interactions), chemometrics, classification, data exploration, density estimation, design of experiments, environmetrics, education, image analysis, marketing, model-free data exploration, pattern recognition, psychometrics, statistical physics, image processing, robust procedures.
[...]
III) Special Applications - [...]
IV) Annals of Statistical Data Science [...]