{"title":"Robust Covariance Matrix Estimation for High-Dimensional Compositional Data with Application to Sales Data Analysis.","authors":"Danning Li, Arun Srinivasan, Qian Chen, Lingzhou Xue","doi":"10.1080/07350015.2022.2106990","DOIUrl":null,"url":null,"abstract":"<p><p>Compositional data arises in a wide variety of research areas when some form of standardization and composition is necessary. Estimating covariance matrices is of fundamental importance for high-dimensional compositional data analysis. However, existing methods require the restrictive Gaussian or sub-Gaussian assumption, which may not hold in practice. We propose a robust composition adjusted thresholding covariance procedure based on Huber-type M-estimation to estimate the sparse covariance structure of high-dimensional compositional data. We introduce a cross-validation procedure to choose the tuning parameters of the proposed method. Theoretically, by assuming a bounded fourth moment condition, we obtain the rates of convergence and signal recovery property for the proposed method and provide the theoretical guarantees for the cross-validation procedure under the high-dimensional setting. Numerically, we demonstrate the effectiveness of the proposed method in simulation studies and also a real application to sales data analysis.</p>","PeriodicalId":2,"journal":{"name":"ACS Applied Bio Materials","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10730115/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Bio Materials","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1080/07350015.2022.2106990","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2022/9/21 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"MATERIALS SCIENCE, BIOMATERIALS","Score":null,"Total":0}
引用次数: 0
Abstract
Compositional data arises in a wide variety of research areas when some form of standardization and composition is necessary. Estimating covariance matrices is of fundamental importance for high-dimensional compositional data analysis. However, existing methods require the restrictive Gaussian or sub-Gaussian assumption, which may not hold in practice. We propose a robust composition adjusted thresholding covariance procedure based on Huber-type M-estimation to estimate the sparse covariance structure of high-dimensional compositional data. We introduce a cross-validation procedure to choose the tuning parameters of the proposed method. Theoretically, by assuming a bounded fourth moment condition, we obtain the rates of convergence and signal recovery property for the proposed method and provide the theoretical guarantees for the cross-validation procedure under the high-dimensional setting. Numerically, we demonstrate the effectiveness of the proposed method in simulation studies and also a real application to sales data analysis.