Sparsity-Preserving Encodings for Straggler-Optimal Distributed Matrix Computations at the Edge

Anindya Bijoy Das, Aditya Ramamoorthy, David J. Love, Christopher G. Brinton
arXiv:2408.05152 (https://doi.org/arxiv-2408.05152) · arXiv - CS - Distributed, Parallel, and Cluster Computing · Published: 2024-08-09 · Citations: 0

Abstract

Matrix computations are a fundamental building block of edge computing systems, with a major recent uptick in demand due to their use in AI/ML training and inference procedures. Existing approaches for distributing matrix computations involve allocating coded combinations of submatrices to worker nodes to build resilience to slower nodes, called stragglers. In the edge learning context, however, these approaches will compromise sparsity properties that are often present in the original matrices found at the edge server. In this study, we consider the challenge of augmenting such approaches to preserve input sparsity when distributing the task across edge devices, thereby retaining the associated computational efficiency enhancements. First, we find a lower bound on the weight of coding, i.e., the number of submatrices to be combined to obtain coded submatrices, that provides resilience to the maximum possible number of straggler devices (for a given number of devices and their storage constraints). Next, we propose distributed matrix computation schemes which meet this lower bound on the weight of the coding exactly. Numerical experiments conducted on Amazon Web Services (AWS) validate our assertions regarding straggler mitigation and computation speed for sparse matrices.
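The tradeoff the abstract describes can be seen in a minimal sketch (illustrative only, not the paper's actual scheme): partition a sparse matrix into row-block submatrices and compare the density of a coded submatrix built with full weight (combining all blocks, as in classical MDS-style coding) against one built with low weight (combining only two blocks). All matrix sizes, coefficients, and the toy sparsity pattern below are assumptions for illustration.

```python
import numpy as np

# Illustrative sketch of how the "weight" of a coded submatrix -- the
# number of original submatrices combined -- affects the sparsity of
# the operand each worker must multiply.

m, n, k = 8, 6, 4
A = np.zeros((m, n))
for i in range(m):
    A[i, i % n] = 1.0            # one nonzero per row: density 8/48

blocks = np.split(A, k, axis=0)  # row-block submatrices A_0, ..., A_3

def density(M):
    return np.count_nonzero(M) / M.size

# Weight-k (dense) coding: every coded submatrix combines all k blocks,
# so the nonzero patterns of all blocks pile up in the coded operand.
dense_coded = sum((j + 1.0) * B for j, B in enumerate(blocks))

# Weight-2 coding: each coded submatrix combines only two blocks,
# so the worker's operand stays much sparser.
light_coded = 1.0 * blocks[0] + 2.0 * blocks[1]

print(density(A), density(light_coded), density(dense_coded))
```

For this toy pattern the original matrix has density 1/6, the weight-2 coded submatrix has density 1/3, and the full-weight coded submatrix has density 1/2; the paper's contribution is characterizing (and achieving) the minimum weight that still tolerates the maximum number of stragglers.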