An intuitive and most efficient L1-norm principal component analysis algorithm for big data
Author: Xiaowei Song
Venue: 2019 53rd Annual Conference on Information Sciences and Systems (CISS)
Publication date: 2019-03-01
DOI: https://doi.org/10.1109/CISS.2019.8692807
Grassmann average (GA) can coincide with the L1-norm principal component (PC) and scales to millions of samples. However, it is unclear whether, and by how much, its speed can be improved further by revising the fixed-point optimization on which GA is based. In this paper, I analyze this optimization process in an intuitive way and propose an improvement: an online algorithm that requires no iterations. I show that it can be most efficient in the sense that it visits each sample only once per PC and has a minimal memory requirement, unlike GA or MATLAB svds. It is proved to converge for big data.
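To make the contrast between an iterated fixed-point scheme and a single-pass scheme concrete, here is a minimal Python sketch. It assumes zero-mean data and estimates only the first PC. `ga_fixed_point` follows the commonly described sign-weighted averaging update, which sweeps the whole data set on every iteration. `one_pass_sketch` is a hypothetical online variant written for illustration only: it touches each sample once and stores a single running vector. The abstract does not specify the paper's actual update rule, so this should not be read as the author's algorithm. All function names, parameters, and the synthetic data are assumptions.

```python
import numpy as np


def ga_fixed_point(X, n_iter=100, tol=1e-10, seed=0):
    """Fixed-point iteration q <- normalize(sum_n sign(x_n . q) x_n).

    Every iteration sweeps all rows of X, so the data is visited many times.
    """
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(X.shape[1])
    q /= np.linalg.norm(q)
    for _ in range(n_iter):
        # Flip each sample into the half-space of the current estimate.
        signs = np.where(X @ q >= 0.0, 1.0, -1.0)
        q_new = signs @ X
        q_new /= np.linalg.norm(q_new)
        if np.linalg.norm(q_new - q) < tol:
            return q_new
        q = q_new
    return q


def one_pass_sketch(samples):
    """Hypothetical single-pass variant (an assumption, not the paper's method).

    Keeps one running sum s; each sample is sign-aligned with s and added once.
    """
    s = None
    for x in samples:
        x = np.asarray(x, dtype=float)
        if s is None:
            if np.any(x):  # start from the first nonzero sample
                s = x.copy()
            continue
        s += x if x @ s >= 0.0 else -x
    if s is None:
        raise ValueError("need at least one nonzero sample")
    return s / np.linalg.norm(s)


# Synthetic zero-mean data whose dominant direction is the first axis.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 3)) * np.array([5.0, 1.0, 0.5])

q_ga = ga_fixed_point(X)
q_online = one_pass_sketch(X)
print("abs. cosine between the two estimates:", abs(q_ga @ q_online))
```

Because `one_pass_sketch` consumes any iterable of samples, it could be fed from a stream without holding the full data matrix in memory, which is the kind of memory footprint the abstract describes.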