On the Information Theoretic Secure Aggregation With Uncoded Groupwise Keys

IF 2.2; Zone 3, Computer Science; JCR Q3, COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Information Theory Pub Date : 2024-07-02 DOI:10.1109/TIT.2024.3422087

Kai Wan;Xin Yao;Hua Sun;Mingyue Ji;Giuseppe Caire

Volume 70, Issue 9, pp. 6596-6619. Journal Article. https://ieeexplore.ieee.org/document/10580953/

Citations: 0

Abstract

Secure aggregation, a core component of federated learning, aggregates locally trained models from distributed users at a central server. The aggregation is “secure” in the sense that no information about the users’ local data may be leaked to the server beyond the aggregated local models. To guarantee security, some keys may be shared among the users (the key sharing phase). After the key sharing phase, each user masks its trained model and sends the masked model to the server (the model aggregation phase). This paper follows the information theoretic secure aggregation problem originally formulated by Zhao and Sun, with the objective of characterizing the minimum communication cost from the $\mathsf{K}$ users in the model aggregation phase. Because of user dropouts, which are common in real systems, the server may not receive all messages from the users. A secure aggregation scheme should tolerate the dropout of at most $\mathsf{K}-\mathsf{U}$ users, where $\mathsf{U}$ is a system parameter. Zhao and Sun characterized the optimal communication cost, but under the assumption that the keys stored by the users may be any random variables with arbitrary dependency. Motivated by the fact that uncoded groupwise keys are more convenient to share and could be used in a wide range of applications besides federated learning, this paper adds one constraint to the above problem: the key variables are mutually independent and each key is shared by a group of $\mathsf{S}$ users, where $\mathsf{S}$ is another system parameter. To the best of our knowledge, all existing secure aggregation schemes (with information theoretic or computational security) assign coded keys to the users. We show that if $\mathsf{S} > \mathsf{K}-\mathsf{U}$, a new secure aggregation scheme with uncoded groupwise keys achieves the same optimal communication cost as the best scheme with coded keys; if $\mathsf{S} \leq \mathsf{K}-\mathsf{U}$, uncoded groupwise key sharing is strictly sub-optimal. Finally, we implement the proposed secure aggregation scheme on Amazon EC2 and compare it with existing secure aggregation schemes with offline key sharing.
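The two phases described in the abstract can be illustrated with a toy sketch. This is not the scheme proposed in the paper: it assumes no user drops out, works over a prime field chosen only for illustration, and uses a simple hypothetical coefficient rule so that each groupwise key cancels in the sum. The modulus `Q` and all function names are assumptions of this sketch. It only demonstrates that independent keys, each held by a group of S users, can mask individual models while leaving the aggregate recoverable; it makes no security claim and does not handle dropouts.

```python
import itertools
import secrets

Q = 2**31 - 1  # prime modulus of the toy field (an assumption of this sketch)


def uncoded_keys_reach_optimum(K: int, U: int, S: int) -> bool:
    """Regime stated in the abstract: the optimum is reached when S > K - U."""
    return S > K - U


def share_groupwise_keys(K: int, S: int) -> dict:
    """Key sharing phase: one independent uniform key per group of S users."""
    return {g: secrets.randbelow(Q) for g in itertools.combinations(range(K), S)}


def mask(user: int, model: int, keys: dict) -> int:
    """Model aggregation phase: mask a model with every key the user holds.

    Hypothetical rule: in each group the first S-1 members add +Z and the
    last member adds -(S-1)*Z, so the key's coefficients sum to zero.
    """
    masked = model % Q
    for group, z in keys.items():
        if user in group:
            coeff = 1 if user != group[-1] else -(len(group) - 1)
            masked = (masked + coeff * z) % Q
    return masked


def aggregate(messages: list) -> int:
    """Server side: summing all masked models cancels every key."""
    return sum(messages) % Q


if __name__ == "__main__":
    K, S = 4, 2
    models = [5, 7, 11, 13]  # scalar stand-ins for local models
    keys = share_groupwise_keys(K, S)
    messages = [mask(i, m, keys) for i, m in enumerate(models)]
    print(aggregate(messages) == sum(models) % Q)
```

With S = 2 this reduces to familiar pairwise additive masking. Tolerating up to K - U dropouts, which is the regime the paper studies, would require additional machinery that this sketch deliberately omits.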




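Source journal

IEEE Transactions on Information Theory (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 5.70
Self-citation rate: 20.00%
Articles published: 514
Review time: 12 months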

Journal description: The IEEE Transactions on Information Theory is a journal that publishes theoretical and experimental papers concerned with the transmission, processing, and utilization of information. The boundaries of acceptable subject matter are intentionally not sharply delimited. Rather, it is hoped that as the focus of research activity changes, a flexible policy will permit this Transactions to follow suit. Current appropriate topics are best reflected by recent Tables of Contents; they are summarized in the titles of editorial areas that appear on the inside front cover.