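Comparison of Correction Factors and Sample Size Required to Test the Equality of the Smallest Eigenvalues in Principal Component Analysis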

Q3 Mathematics
Eduard Gañan-Cardenas, J. C. Correa-Morales
DOI: 10.15446/RCE.V44N1.83987 (https://doi.org/10.15446/RCE.V44N1.83987)
Journal: Revista Colombiana De Estadistica
Publication date: 2021-01-15
Publication type: Journal Article
Citations: 3

Abstract

Comparison of Correction Factors and Sample Size Required to Test the Equality of the Smallest Eigenvalues in Principal Component Analysis
In the inferential process of Principal Component Analysis (PCA), one of the main challenges for researchers is establishing the correct number of components to represent the sample. For that purpose, heuristic and statistical strategies have been proposed. One statistical approach consists of testing the hypothesis of equality of the smallest eigenvalues of the covariance or correlation matrix using a Likelihood-Ratio Test (LRT) whose statistic follows a χ² limiting distribution. Different correction factors have been proposed to improve the approximation of the sampling distribution of the statistic. We use simulation to study the significance level and power of the test under these different factors and analyze the sample size required for an adequate approximation. The results indicate that, for the covariance matrix, the factor proposed by Bartlett offers the best balance between the objectives of a low probability of Type I error and high power. If the correlation matrix is used, the factors W* and cχ² are the most recommended. Empirically, we observe that most factors require sample sizes of 10 or 20 times the number of variables when the covariance or correlation matrix, respectively, is used.
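As a rough illustration of the kind of test the abstract describes, the sketch below computes a likelihood-ratio statistic for the hypothesis that the last q = p − k eigenvalues of a covariance matrix are equal, together with the degrees of freedom of its limiting χ² distribution, (q − 1)(q + 2)/2. This is the standard textbook form of the statistic, not necessarily the exact formulation used by the authors. The function name and the `multiplier` argument are hypothetical: passing the sample size gives the uncorrected statistic, while a correction factor such as those compared in the paper would be supplied by the caller as an adjusted multiplier.

```python
import math


def equal_smallest_eigenvalues_lrt(eigenvalues, k, multiplier):
    """Likelihood-ratio statistic for H0: the last p - k eigenvalues are equal.

    eigenvalues: sample eigenvalues of a covariance matrix (any order).
    k:           number of leading components retained (not tested).
    multiplier:  scaling constant; the sample size gives the uncorrected
                 statistic, an adjusted value applies a correction factor
                 (assumption: the caller chooses which one).
    Returns (statistic, degrees_of_freedom).
    """
    lam = sorted(eigenvalues, reverse=True)
    p = len(lam)
    q = p - k
    if k < 0 or q < 2:
        raise ValueError("need 0 <= k <= p - 2 so at least two eigenvalues are tested")

    tail = lam[k:]
    if min(tail) <= 0:
        raise ValueError("tested eigenvalues must be strictly positive")

    # q * log(arithmetic mean) - sum of logs = q * log(arithmetic / geometric mean),
    # which is non-negative and zero only when all tested eigenvalues coincide.
    arithmetic_mean = sum(tail) / q
    sum_of_logs = sum(math.log(v) for v in tail)
    statistic = multiplier * (q * math.log(arithmetic_mean) - sum_of_logs)

    # Degrees of freedom of the limiting chi-square distribution.
    degrees_of_freedom = (q - 1) * (q + 2) // 2
    return statistic, degrees_of_freedom


if __name__ == "__main__":
    # Illustrative eigenvalues only, not data from the paper.
    stat, df = equal_smallest_eigenvalues_lrt([4.0, 2.0, 1.0], k=1, multiplier=100)
    print(f"statistic = {stat:.3f}, df = {df}")
```

The statistic would then be compared with an upper quantile of the χ² distribution with the returned degrees of freedom. Taking eigenvalues as input keeps the sketch free of external dependencies; in practice they would come from an eigendecomposition of the sample covariance matrix.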
Source journal
Revista Colombiana De Estadistica (Statistics & Probability)
CiteScore: 1.20
Self-citation rate: 0.00%
Articles published: 0
Review time: >12 weeks
About the journal: The Colombian Journal of Statistics publishes original theoretical, methodological, and educational articles in any branch of statistics. Purely theoretical papers should illustrate the techniques presented with real data, or at least with simulation experiments, to verify the usefulness of the contents. High-quality informative articles on methodologies or statistical techniques applied in different fields of knowledge are also considered. Only articles in English are considered for publication. The Editorial Committee assumes that works submitted for evaluation have not been previously published, are not being submitted simultaneously for publication elsewhere, and will not be without the prior consent of the Committee unless, as a result of the assessment, the Committee decides not to publish them in the journal. It is further assumed that authors who deliver a document for publication in the Colombian Journal of Statistics know the above conditions and agree with them.