DAGC:数据感知自适应梯度压缩

R. Lu, Jiajun Song, B. Chen, Laizhong Cui, Zhi Wang
{"title":"DAGC:数据感知自适应梯度压缩","authors":"R. Lu, Jiajun Song, B. Chen, Laizhong Cui, Zhi Wang","doi":"10.1109/INFOCOM53939.2023.10229053","DOIUrl":null,"url":null,"abstract":"Gradient compression algorithms are widely used to alleviate the communication bottleneck in distributed ML. However, existing gradient compression algorithms suffer from accuracy degradation in Non-IID scenarios, because a uniform compression scheme is used to compress gradients at workers with different data distributions and volumes, since workers with larger volumes of data are forced to adapt to the same aggressive compression ratios as others. Assigning different compression ratios to workers with different data distributions and volumes is thus a promising solution. In this study, we first derive a function from capturing the correlation between the number of training iterations for a model to converge to the same accuracy, and the compression ratios at different workers; This function particularly shows that workers with larger data volumes should be assigned with higher compression ratios1 to guarantee better accuracy. Then, we formulate the assignment of compression ratios to the workers as an n-variables chi-square nonlinear optimization problem under fixed and limited total communication constrain. We propose an adaptive gradient compression strategy called DAGC, which assigns each worker a different compression ratio according to their data volumes. Our experiments confirm that DAGC can achieve better performance facing highly imbalanced data volume distribution and restricted communication.","PeriodicalId":387707,"journal":{"name":"IEEE INFOCOM 2023 - IEEE Conference on Computer Communications","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DAGC: Data-Aware Adaptive Gradient Compression\",\"authors\":\"R. Lu, Jiajun Song, B. Chen, Laizhong Cui, Zhi Wang\",\"doi\":\"10.1109/INFOCOM53939.2023.10229053\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Gradient compression algorithms are widely used to alleviate the communication bottleneck in distributed ML. However, existing gradient compression algorithms suffer from accuracy degradation in Non-IID scenarios, because a uniform compression scheme is used to compress gradients at workers with different data distributions and volumes, since workers with larger volumes of data are forced to adapt to the same aggressive compression ratios as others. Assigning different compression ratios to workers with different data distributions and volumes is thus a promising solution. In this study, we first derive a function from capturing the correlation between the number of training iterations for a model to converge to the same accuracy, and the compression ratios at different workers; This function particularly shows that workers with larger data volumes should be assigned with higher compression ratios1 to guarantee better accuracy. Then, we formulate the assignment of compression ratios to the workers as an n-variables chi-square nonlinear optimization problem under fixed and limited total communication constrain. We propose an adaptive gradient compression strategy called DAGC, which assigns each worker a different compression ratio according to their data volumes. Our experiments confirm that DAGC can achieve better performance facing highly imbalanced data volume distribution and restricted communication.\",\"PeriodicalId\":387707,\"journal\":{\"name\":\"IEEE INFOCOM 2023 - IEEE Conference on Computer Communications\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE INFOCOM 2023 - IEEE Conference on Computer Communications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/INFOCOM53939.2023.10229053\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE INFOCOM 2023 - IEEE Conference on Computer Communications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/INFOCOM53939.2023.10229053","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

梯度压缩算法被广泛用于缓解分布式机器学习中的通信瓶颈。然而,现有的梯度压缩算法在非iid场景中存在精度下降的问题,因为使用统一的压缩方案来压缩具有不同数据分布和容量的工人的梯度,因为具有较大数据量的工人被迫适应与其他工人相同的积极压缩比。因此,为具有不同数据分布和数据量的工作分配不同的压缩比是一个很有前途的解决方案。在本研究中,我们首先通过捕获模型收敛到相同精度的训练迭代次数与不同工人的压缩比之间的相关性推导出一个函数;该函数特别表明,数据量较大的工人应该分配更高的压缩比1,以保证更好的准确性。然后,我们将压缩比分配化为固定有限总通信约束下的n变量卡方非线性优化问题。我们提出了一种称为DAGC的自适应梯度压缩策略,该策略根据每个工人的数据量分配不同的压缩比。实验证明,在数据量分布高度不平衡和通信受限的情况下,DAGC可以获得更好的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
DAGC: Data-Aware Adaptive Gradient Compression
Gradient compression algorithms are widely used to alleviate the communication bottleneck in distributed ML. However, existing gradient compression algorithms suffer from accuracy degradation in Non-IID scenarios, because a uniform compression scheme is used to compress gradients at workers with different data distributions and volumes, since workers with larger volumes of data are forced to adapt to the same aggressive compression ratios as others. Assigning different compression ratios to workers with different data distributions and volumes is thus a promising solution. In this study, we first derive a function from capturing the correlation between the number of training iterations for a model to converge to the same accuracy, and the compression ratios at different workers; This function particularly shows that workers with larger data volumes should be assigned with higher compression ratios1 to guarantee better accuracy. Then, we formulate the assignment of compression ratios to the workers as an n-variables chi-square nonlinear optimization problem under fixed and limited total communication constrain. We propose an adaptive gradient compression strategy called DAGC, which assigns each worker a different compression ratio according to their data volumes. Our experiments confirm that DAGC can achieve better performance facing highly imbalanced data volume distribution and restricted communication.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信