An intuitive and most efficient L1-norm principal component analysis algorithm for big data

Xiaowei Song
Published in: 2019 53rd Annual Conference on Information Sciences and Systems (CISS), 2019-03-01
DOI: 10.1109/CISS.2019.8692807 (https://doi.org/10.1109/CISS.2019.8692807)
Citations: 2

Abstract

Grassmann average (GA) can coincide with the L1-norm principal component (PC) and scales to millions of samples. However, it is unclear whether, and by how much, the speed can be further improved by revising the fixed-point optimization-based GA. In this paper, I analyze this optimization process in an intuitive way and propose an improvement: an online algorithm without any iterations. I show that it can be most efficient in the sense that it visits each sample only once per PC, with minimal memory requirement, unlike GA or MATLAB svds. It is proved to be convergent for big data.
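The abstract does not state the update rules, so the following is only a minimal sketch under stated assumptions, not the paper's algorithm. It assumes the fixed-point step has the sign-flip form commonly used for L1-norm PCA and Grassmann averages, q <- normalize(sum_n sign(x_n . q) x_n), and it illustrates one possible single-pass variant in which each sample is aligned with a running sum. The function names `ga_fixed_point` and `one_pass_direction` are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale v to unit Euclidean length (assumes v is nonzero)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def ga_fixed_point(samples, q0, max_iter=100):
    """Iterative baseline (assumed form): repeat
    q <- normalize(sum_n sign(x_n . q) * x_n) until q stops changing.
    Every iteration visits all samples."""
    q = normalize(q0)
    for _ in range(max_iter):
        total = [0.0] * len(q)
        for x in samples:
            s = 1.0 if dot(x, q) >= 0 else -1.0
            total = [t + s * xi for t, xi in zip(total, x)]
        q_new = normalize(total)
        if q_new == q:  # the signs no longer change, so q is a fixed point
            break
        q = q_new
    return q


def one_pass_direction(samples):
    """Hypothetical single-pass variant: flip each sample so it agrees
    with the running sum, then add it. Each sample is visited once and
    only one vector of state is kept. Assumes the first sample is nonzero."""
    total = None
    for x in samples:
        if total is None:
            total = list(x)
            continue
        s = 1.0 if dot(x, total) >= 0 else -1.0
        total = [t + s * xi for t, xi in zip(total, x)]
    return normalize(total)
```

In this sketch the single-pass routine stores only the running sum, which is the kind of memory profile the abstract describes. Whether it matches the published method, and under what conditions it converges, should be checked against the paper itself.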