Sparsity-Preserving Encodings for Straggler-Optimal Distributed Matrix Computations at the Edge

Anindya Bijoy Das, Aditya Ramamoorthy, David J. Love, Christopher G. Brinton
arXiv:2408.05152 (https://doi.org/arxiv-2408.05152) · arXiv - CS - Distributed, Parallel, and Cluster Computing · Published: 2024-08-09 · Citations: 0

Abstract

Matrix computations are a fundamental building block of edge computing systems, with a major recent uptick in demand due to their use in AI/ML training and inference procedures. Existing approaches for distributing matrix computations involve allocating coded combinations of submatrices to worker nodes to build resilience to slower nodes, called stragglers. In the edge learning context, however, these approaches will compromise sparsity properties that are often present in the original matrices found at the edge server. In this study, we consider the challenge of augmenting such approaches to preserve input sparsity when distributing the task across edge devices, thereby retaining the associated computational efficiency enhancements. First, we find a lower bound on the weight of coding, i.e., the number of submatrices to be combined to obtain coded submatrices, that provides resilience to the maximum possible number of straggler devices (for a given number of devices and their storage constraints). Next, we propose distributed matrix computation schemes which meet this lower bound on the weight of the coding exactly. Numerical experiments conducted on Amazon Web Services (AWS) validate our assertions regarding straggler mitigation and computation speed for sparse matrices.
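The tradeoff the abstract describes can be seen in a minimal sketch (illustrative only, not the paper's actual scheme): partition a sparse matrix into row-block submatrices and compare the density of a coded submatrix built with full weight (combining all blocks, as in classical MDS-style coding) against one built with low weight (combining only two blocks). All matrix sizes, coefficients, and the toy sparsity pattern below are assumptions for illustration.

```python
import numpy as np

# Illustrative sketch of how the "weight" of a coded submatrix -- the
# number of original submatrices combined -- affects the sparsity of
# the operand each worker must multiply.

m, n, k = 8, 6, 4
A = np.zeros((m, n))
for i in range(m):
    A[i, i % n] = 1.0            # one nonzero per row: density 8/48

blocks = np.split(A, k, axis=0)  # row-block submatrices A_0, ..., A_3

def density(M):
    return np.count_nonzero(M) / M.size

# Weight-k (dense) coding: every coded submatrix combines all k blocks,
# so the nonzero patterns of all blocks pile up in the coded operand.
dense_coded = sum((j + 1.0) * B for j, B in enumerate(blocks))

# Weight-2 coding: each coded submatrix combines only two blocks,
# so the worker's operand stays much sparser.
light_coded = 1.0 * blocks[0] + 2.0 * blocks[1]

print(density(A), density(light_coded), density(dense_coded))
```

For this toy pattern the original matrix has density 1/6, the weight-2 coded submatrix has density 1/3, and the full-weight coded submatrix has density 1/2; the paper's contribution is characterizing (and achieving) the minimum weight that still tolerates the maximum number of stragglers.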