Vers: Coded Computing System With Distributed Encoding

Impact factor 2.9 | Zone 3 (Computer Science) | JCR Q3 (Computer Science, Information Systems)

IEEE Transactions on Information Theory Pub Date : 2025-07-22 DOI:10.1109/TIT.2025.3591523

Nastaran Abadi Khooshemehr;Mohammad Ali Maddah-Ali

Vol. 71, No. 10, pp. 7609-7625 | Journal Article | https://ieeexplore.ieee.org/document/11088248/

Citations: 0

Abstract


Vers: Coded Computing System With Distributed Encoding

Coded computing has proved useful in distributed computing and has addressed challenges such as straggler workers. We have observed that almost all coded computing systems studied so far consider a setup of one leader and some workers. However, recently emerging technologies such as blockchain, the internet of things, and federated learning introduce new requirements for coded computing systems. In these systems, data is generated (and probably stored) in a distributed manner, so central encoding/decoding by a leader is neither feasible nor scalable. This paper presents a multi-leader distributed coded computing system that consists of $K\in \mathbb{N}$ data owners and $N\in \mathbb{N}$ workers, where data owners employ workers to do some computations on their data, as specified by a target function $f$ of degree $d\in \mathbb{N}$. As there is no central encoder, workers perform the encoding themselves, prior to the computation phase. The challenge in this system is the presence of adversarial data owners that do not know the data of honest data owners but cause discrepancies by sending different versions of data to different workers, which is detrimental to the local encodings at the workers. There are at most $\beta \in \mathbb{N}$ adversarial data owners, and each distributes at most $v\in \mathbb{N}$ different versions of data. Since the adversaries and their possibly colluding behavior are not known to workers and honest data owners, workers compute tags of their received data, in addition to their main computational task, and send them to data owners to help them in decoding. We introduce a tag function that allows data owners to partition workers into sets that previously received the same data from all data owners. Then, we characterize the fundamental limit of this multi-leader distributed coded computing system, denoted by $t^{*}$, which is the minimum number of workers whose work can be used to correctly calculate the desired function of the data of honest data owners. We show that $t^{*}=v^{\beta}d(K-1)+1$, and present converse and achievability proofs.
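As a rough illustration of the two ideas in the abstract, the sketch below evaluates the stated threshold $t^{*}=v^{\beta}d(K-1)+1$ and groups workers whose tags match. The abstract does not describe the paper's actual tag function, so the SHA-256 digest used here is only a hypothetical stand-in, and the names `recovery_threshold`, `tag`, and `partition_by_tag` are assumptions made for this example.

```python
import hashlib
from collections import defaultdict


def recovery_threshold(v: int, beta: int, d: int, K: int) -> int:
    """Evaluate t* = v^beta * d * (K - 1) + 1 as stated in the abstract."""
    return v**beta * d * (K - 1) + 1


def tag(received: list[bytes]) -> str:
    """Hypothetical stand-in for the paper's tag function (an assumption).

    Digests everything a worker received, in data-owner order.
    """
    h = hashlib.sha256()
    for item in received:
        # A length prefix keeps the boundaries between items unambiguous.
        h.update(len(item).to_bytes(8, "big"))
        h.update(item)
    return h.hexdigest()


def partition_by_tag(worker_inputs: dict[str, list[bytes]]) -> list[list[str]]:
    """Group workers whose tags agree, i.e. who received identical data."""
    groups: dict[str, list[str]] = defaultdict(list)
    for worker, received in worker_inputs.items():
        groups[tag(received)].append(worker)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    # Two data owners; the second one is adversarial and sends w3 a
    # different version of its data than it sends w1 and w2.
    inputs = {
        "w1": [b"x1", b"a"],
        "w2": [b"x1", b"a"],
        "w3": [b"x1", b"b"],
    }
    print(partition_by_tag(inputs))  # [['w1', 'w2'], ['w3']]
    print(recovery_threshold(v=2, beta=1, d=2, K=3))  # 9
```

With $v=2$, $\beta=1$, $d=2$, and $K=3$, the formula gives $2^{1}\cdot 2\cdot 2+1=9$ workers. The grouping step only mirrors the role the abstract assigns to tags (identifying workers that received the same data from all data owners); it makes no claim about the properties of the tag function the authors construct.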


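Source journal

IEEE Transactions on Information Theory (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 5.70
Self-citation rate: 20.00%
Articles published: 514
Review time: 12 months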

About the journal: The IEEE Transactions on Information Theory is a journal that publishes theoretical and experimental papers concerned with the transmission, processing, and utilization of information. The boundaries of acceptable subject matter are intentionally not sharply delimited. Rather, it is hoped that as the focus of research activity changes, a flexible policy will permit this Transactions to follow suit. Current appropriate topics are best reflected by recent Tables of Contents; they are summarized in the titles of editorial areas that appear on the inside front cover.