Adaptively-Accelerated Parallel Stochastic Gradient Descent for High-Dimensional and Incomplete Data Representation Learning

IF 7.5 3区 计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Wen Qin;Xin Luo;MengChu Zhou
{"title":"Adaptively-Accelerated Parallel Stochastic Gradient Descent for High-Dimensional and Incomplete Data Representation Learning","authors":"Wen Qin;Xin Luo;MengChu Zhou","doi":"10.1109/TBDATA.2023.3326304","DOIUrl":null,"url":null,"abstract":"High-dimensional and incomplete (HDI) interactions among numerous nodes are commonly encountered in a Big Data-related application, like user-item interactions in a recommender system. Owing to its high efficiency and flexibility, a stochastic gradient descent (SGD) algorithm can enable efficient latent feature analysis (LFA) of HDI data for its precise representation, thereby enabling efficient solutions to knowledge acquisition issues like missing data estimation. However, LFA on HDI data involves a bilinear issue, making SGD-based LFA a sequential process, i.e., the update on a feature can impact the results on the others. Intervening the sequence of SGD-based LFA on HDI data can affect the training results. Therefore, a parallel SGD algorithm to LFA should be designed with care. Existing parallel SGD-based LFA models suffer from a) low parallelization degree, and b) slow convergence, which significantly restrict their scalability. Aiming at addressing these vital issues, this paper presents an \n<underline>A</u>\ndaptively-accelerated \n<underline>P</u>\narallel \n<underline>S</u>\ntochastic \n<underline>G</u>\nradient \n<underline>D</u>\nescent (AP-SGD) algorithm to LFA by: a) establishing a novel local minimum-based data splitting and scheduling scheme to reduce the scheduling cost among threads, thereby achieving high parallelization degree; and b) incorporating the adaptive momentum method into the learning scheme, thereby accelerating the convergence rate by making the learning rate and acceleration coefficient self-adaptive. The convergence of the achieved AP-SGD-based LFA model is theoretically proved. Experimental results on three HDI matrices generated by real industrial applications demonstrate that the AP-SGD-based LFA model outperforms state-of-the-art parallel SGD-based LFA models in both estimation accuracy for missing data and parallelization degree. Hence, it has the potential for efficient representation of HDI data in industrial scenes.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"10 1","pages":"92-107"},"PeriodicalIF":7.5000,"publicationDate":"2023-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Big Data","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10292527/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

High-dimensional and incomplete (HDI) interactions among numerous nodes are commonly encountered in a Big Data-related application, like user-item interactions in a recommender system. Owing to its high efficiency and flexibility, a stochastic gradient descent (SGD) algorithm can enable efficient latent feature analysis (LFA) of HDI data for its precise representation, thereby enabling efficient solutions to knowledge acquisition issues like missing data estimation. However, LFA on HDI data involves a bilinear issue, making SGD-based LFA a sequential process, i.e., the update on a feature can impact the results on the others. Intervening the sequence of SGD-based LFA on HDI data can affect the training results. Therefore, a parallel SGD algorithm to LFA should be designed with care. Existing parallel SGD-based LFA models suffer from a) low parallelization degree, and b) slow convergence, which significantly restrict their scalability. Aiming at addressing these vital issues, this paper presents an A daptively-accelerated P arallel S tochastic G radient D escent (AP-SGD) algorithm to LFA by: a) establishing a novel local minimum-based data splitting and scheduling scheme to reduce the scheduling cost among threads, thereby achieving high parallelization degree; and b) incorporating the adaptive momentum method into the learning scheme, thereby accelerating the convergence rate by making the learning rate and acceleration coefficient self-adaptive. The convergence of the achieved AP-SGD-based LFA model is theoretically proved. Experimental results on three HDI matrices generated by real industrial applications demonstrate that the AP-SGD-based LFA model outperforms state-of-the-art parallel SGD-based LFA models in both estimation accuracy for missing data and parallelization degree. Hence, it has the potential for efficient representation of HDI data in industrial scenes.
用于高维和不完整数据表示学习的自适应加速并行随机梯度下降技术
在大数据相关应用(如推荐系统中的用户-项目交互)中,通常会遇到众多节点之间的高维和不完整(HDI)交互。随机梯度下降算法(SGD)具有高效和灵活的特点,可以对 HDI 数据进行高效的潜在特征分析(LFA),以精确地表示 HDI 数据,从而有效地解决缺失数据估计等知识获取问题。然而,HDI 数据的 LFA 涉及双线性问题,这使得基于 SGD 的 LFA 成为一个顺序过程,即一个特征的更新会影响其他特征的结果。在 HDI 数据上干预基于 SGD 的 LFA 的顺序会影响训练结果。因此,应谨慎设计 LFA 的并行 SGD 算法。现有的基于 SGD 的并行 LFA 模型存在以下问题:a)并行化程度低;b)收敛速度慢,这极大地限制了其可扩展性。为了解决这些重要问题,本文提出了一种针对 LFA 的自适应加速并行随机梯度下降算法(AP-SGD):a)建立一种新颖的基于局部最小值的数据分割和调度方案,以降低线程间的调度成本,从而实现高并行化程度;b)将自适应动量法纳入学习方案,通过使学习速率和加速系数自适应来加快收敛速度。理论证明了所实现的基于 AP-SGD 的 LFA 模型的收敛性。在实际工业应用中生成的三个 HDI 矩阵上的实验结果表明,基于 AP-SGD 的 LFA 模型在缺失数据估计精度和并行化程度上都优于最先进的基于 SGD 的并行 LFA 模型。因此,它具有高效表示工业场景中 HDI 数据的潜力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
11.80
自引率
2.80%
发文量
114
期刊介绍: The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信