Analysis of randomized robust PCA for high dimensional data
M. Rahmani, George K. Atia
2015 IEEE Signal Processing and Signal Processing Education Workshop (SP/SPE), pp. 25-30, published 2015-08-01
DOI: 10.1109/DSP-SPE.2015.7369522 (https://doi.org/10.1109/DSP-SPE.2015.7369522)
Robust Principal Component Analysis (PCA), also known as robust subspace recovery, is an important problem in unsupervised learning with a broad range of applications. In this paper, we analyze a randomized robust subspace recovery algorithm and show that its complexity is independent of the size of the data matrix. Exploiting the intrinsic low-dimensional geometry of the low-rank matrix, the algorithm first compresses the large data matrix into a much smaller one. It does so by selecting a small random subset of the columns of the given data matrix and projecting the selected columns onto a random low-dimensional subspace. In the next step, a convex robust PCA algorithm is applied to the compressed data to learn the column subspace of the low-rank matrix. We derive new sufficient conditions showing that neither the number of linear observations nor the complexity of the randomized algorithm depends on the size of the given data.
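The compression step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `compress` and `numerical_rank`, the Gaussian projection matrix, the uniform column sampling, and all sizes are assumptions chosen for illustration, and the convex robust PCA step is omitted. The sketch only checks that a rank-r matrix keeps rank r after compression, while the compressed matrix has a size that does not depend on the original dimensions.

```python
import numpy as np

def compress(D, n_cols, n_rows, rng):
    """Sample n_cols columns uniformly without replacement, then apply a
    random Gaussian projection with n_rows rows (both choices are assumptions)."""
    N1, N2 = D.shape
    col_idx = rng.choice(N2, size=n_cols, replace=False)
    Phi = rng.standard_normal((n_rows, N1)) / np.sqrt(n_rows)
    return Phi @ D[:, col_idx], col_idx, Phi

def numerical_rank(A, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(0)
N1, N2, r = 500, 2000, 5          # illustrative sizes, not from the paper

# Synthetic rank-r matrix standing in for the low-rank component.
L = rng.standard_normal((N1, r)) @ rng.standard_normal((r, N2))

# Compressed data: 40 x 50 regardless of N1 and N2.
Y, col_idx, Phi = compress(L, n_cols=50, n_rows=40, rng=rng)

print(Y.shape)             # (40, 50)
print(numerical_rank(Y))   # rank of the compressed low-rank part
```

In a full pipeline, a convex robust PCA solver would then run on the compressed matrix rather than on the original data, which is where the savings come from. The values 50 and 40 are arbitrary here and are not the sufficient conditions derived in the paper.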