A Simulation Study of the Effects of Additive, Multiplicative, Correlated, and Uncorrelated Errors on Principal Component Analysis
Edoardo Saccenti, Marieke E. Timmerman, José Camacho
Journal of Chemometrics, vol. 38, no. 12. Published 2024-10-01.
DOI: 10.1002/cem.3595
https://onlinelibrary.wiley.com/doi/10.1002/cem.3595
Citations: 0
Abstract
Measurement errors are ubiquitous in all experimental sciences. Depending on the experimental platform used to acquire the data, different types of error are introduced, amounting to a mixture of additive and multiplicative error components that can be uncorrelated or correlated. In this paper, we use numerical simulations to investigate how different types of experimental error affect the recovery of the subspace with principal component analysis (PCA). Specifically, we assessed how error characteristics (variance, correlation, and correlation structure), loading structures, and data distributions influence how accurately PCA estimates the error-free (true) subspace from sampled data. Quality was assessed in terms of the mean squared reconstruction error and the congruence with the error-free loadings, using the pseudorank and adjusting for rotational ambiguity. Analysis of variance reveals that the error variance, the error correlation structure, and their interaction with the loading structure are the factors that most affect the quality of loading estimation from sampled data. We advocate for the need to characterize and assess the nature of measurement error and the need for formulations of PCA that explicitly take error structures into account in model fitting.
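The kind of experiment the abstract describes can be illustrated with a small sketch. It is not the authors' simulation design, which the abstract does not specify. Everything here is an assumption made for illustration: the function names (`simulate`, `pca_loadings`, `align`, `congruence`, `evaluate`), the equicorrelated error covariance, the error model `X_true * (1 + E_mult) + E_add`, the use of an orthogonal Procrustes rotation to handle rotational ambiguity, Tucker's congruence coefficient as the congruence measure, and the use of a known rank in place of an estimated pseudorank.

```python
import numpy as np

def simulate(n=200, p=10, rank=2, add_sd=0.1, mult_sd=0.0, rho=0.0, seed=0):
    """Return (X_true, X_obs, P_true) for a low-rank model with measurement error.

    Assumed error model (illustrative only): X_obs = X_true * (1 + E_mult) + E_add,
    where both error terms share an equicorrelated covariance with correlation rho.
    """
    rng = np.random.default_rng(seed)
    scores = rng.standard_normal((n, rank))
    # Orthonormal "true" loadings (p x rank) from a QR decomposition.
    P_true, _ = np.linalg.qr(rng.standard_normal((p, rank)))
    X_true = scores @ P_true.T
    # Equicorrelated covariance: 1 on the diagonal, rho off the diagonal.
    C = (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))
    L = np.linalg.cholesky(C)
    E_add = add_sd * rng.standard_normal((n, p)) @ L.T
    E_mult = mult_sd * rng.standard_normal((n, p)) @ L.T
    X_obs = X_true * (1.0 + E_mult) + E_add
    return X_true, X_obs, P_true

def pca_loadings(X, rank):
    """First `rank` PCA loadings (p x rank) of column-centered X, via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:rank].T

def align(P_hat, P_true):
    """Rotate P_hat toward P_true with the orthogonal Procrustes solution."""
    U, _, Vt = np.linalg.svd(P_hat.T @ P_true)
    return P_hat @ (U @ Vt)

def congruence(A, B):
    """Tucker's congruence coefficient between matching columns of A and B."""
    num = np.sum(A * B, axis=0)
    den = np.sqrt(np.sum(A**2, axis=0) * np.sum(B**2, axis=0))
    return num / den

def evaluate(add_sd, mult_sd, rho, rank=2, seed=0):
    """Return (MSE of the reconstruction vs. error-free data, congruences)."""
    X_true, X_obs, P_true = simulate(rank=rank, add_sd=add_sd,
                                     mult_sd=mult_sd, rho=rho, seed=seed)
    P_hat = pca_loadings(X_obs, rank)
    # Reconstruct the centered observed data from the estimated subspace
    # and compare it with the centered error-free data.
    Xc_obs = X_obs - X_obs.mean(axis=0)
    Xc_true = X_true - X_true.mean(axis=0)
    X_rec = Xc_obs @ P_hat @ P_hat.T
    mse = np.mean((X_rec - Xc_true) ** 2)
    return mse, congruence(align(P_hat, P_true), P_true)

if __name__ == "__main__":
    for add_sd, mult_sd, rho in [(0.1, 0.0, 0.0), (0.5, 0.0, 0.0),
                                 (0.5, 0.0, 0.5), (0.1, 0.3, 0.5)]:
        mse, phi = evaluate(add_sd, mult_sd, rho)
        print(f"add_sd={add_sd} mult_sd={mult_sd} rho={rho} "
              f"mse={mse:.4f} congruence={np.round(phi, 3)}")
```

Running the loop over a grid of error variances and correlations, and repeating it over several seeds, would give the kind of factorial table that an analysis of variance could then be applied to. The Cholesky step requires the assumed covariance to be positive definite, so `rho` must lie strictly between `-1/(p-1)` and `1`.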
About the journal:
The Journal of Chemometrics is devoted to the rapid publication of original scientific papers, reviews and short communications on fundamental and applied aspects of chemometrics. It also provides a forum for the exchange of information on meetings and other news relevant to the growing community of scientists who are interested in chemometrics and its applications. Short, critical review papers are a particularly important feature of the journal, in view of the multidisciplinary readership at which it is aimed.