Multivariate Modeling Analysis Based on Partial Least Squares Regression and Principal Component Regression
Authors: Yulei Chen, Xinwei Zhang, Qi Zou, Hepeng Wang, Shisheng Huang, Liang Lu
Published in: 2022 2nd International Conference on Bioinformatics and Intelligent Computing, 2022-01-21
DOI: 10.1145/3523286.3524584 (https://doi.org/10.1145/3523286.3524584)
Data in many fields are high-dimensional, and the variables are often strongly intercorrelated. To address this, this paper proposes an interpretable modeling method based on partial least squares regression (PLSR) and compares it with principal component regression (PCR). Both PLSR and PCR model a response variable when there are many predictors and those predictors are highly correlated or even collinear. Both methods construct new predictors, called components, as linear combinations of the original predictors, but they construct these components in different ways. A series of cross-validation experiments determines the number of components for each method and is used to assess how effective the two methods are. According to the mean squared prediction error curves, better prediction accuracy is obtained with 3 components for PLSR and 4 components for PCR.
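The component-selection procedure described in the abstract can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' implementation: the data are synthetic, the use of NumPy, the NIPALS variant of PLS for a single response, and 10-fold cross-validation are all choices made here, and the names `pcr_fit`, `pls_fit`, and `cv_mspe` are hypothetical. It shows the usual difference between the two constructions: PCR takes its components from the predictors alone, while PLSR chooses each component using its covariance with the response.

```python
import numpy as np

def pcr_fit(X, y, k):
    """PCR sketch: regress y on the first k principal component scores of X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Right singular vectors of the centered predictors are the PCA loadings.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                                  # (p, k), chosen without using y
    gamma = np.linalg.lstsq(Xc @ V, yc, rcond=None)[0]
    beta = V @ gamma                              # coefficients on original predictors
    return beta, y_mean - x_mean @ beta

def pls_fit(X, y, k):
    """PLS sketch for one response (NIPALS): each weight vector follows X^T y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yc = X - x_mean, y - y_mean
    p = X.shape[1]
    W, P, q = np.zeros((p, k)), np.zeros((p, k)), np.zeros(k)
    for a in range(k):
        w = Xk.T @ yc                             # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                                # component scores
        tt = t @ t
        P[:, a] = Xk.T @ t / tt                   # predictor loadings
        q[a] = yc @ t / tt                        # response loading
        W[:, a] = w
        Xk = Xk - np.outer(t, P[:, a])            # deflate the predictors
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, y_mean - x_mean @ beta

def cv_mspe(fit, X, y, max_k, n_folds=10, seed=0):
    """K-fold cross-validated mean squared prediction error for k = 1..max_k."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    mspe = np.zeros(max_k)
    for k in range(1, max_k + 1):
        sq_err = 0.0
        for test_idx in folds:
            train = np.ones(len(y), dtype=bool)
            train[test_idx] = False
            beta, intercept = fit(X[train], y[train], k)
            resid = y[test_idx] - (X[test_idx] @ beta + intercept)
            sq_err += resid @ resid
        mspe[k - 1] = sq_err / len(y)
    return mspe

# Synthetic, highly collinear predictors driven by a few latent factors (assumed data).
rng = np.random.default_rng(42)
n, p, n_latent = 60, 30, 4
latent = rng.normal(size=(n, n_latent))
X = latent @ rng.normal(size=(n_latent, p)) + 0.05 * rng.normal(size=(n, p))
y = latent @ np.array([1.0, -2.0, 0.5, 1.5]) + 0.1 * rng.normal(size=n)

max_k = 8
pcr_curve = cv_mspe(pcr_fit, X, y, max_k)
pls_curve = cv_mspe(pls_fit, X, y, max_k)
for k in range(max_k):
    print(f"k={k + 1}: PCR MSPE={pcr_curve[k]:.4f}  PLSR MSPE={pls_curve[k]:.4f}")
print("PCR components at minimum MSPE: ", int(np.argmin(pcr_curve)) + 1)
print("PLSR components at minimum MSPE:", int(np.argmin(pls_curve)) + 1)
```

The printed curves play the role of the mean squared prediction error curves mentioned above. The component counts they suggest depend entirely on the synthetic data and should not be read as a reproduction of the paper's result of 3 components for PLSR and 4 for PCR.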