FedREAS: A Robust Efficient Aggregation and Selection Framework for Federated Learning

IF 1.8 · CAS Region 4 (Computer Science) · JCR Q3 (Computer Science, Artificial Intelligence)
Shuming Fan, Chenpei Wang, Xinyu Ruan, Hongjian Shi, Ruhui Ma, Haibing Guan
{"title":"FedREAS: A Robust Efficient Aggregation and Selection Framework for Federated Learning","authors":"Shuming Fan, Chenpei Wang, Xinyu Ruan, Hongjian Shi, Ruhui Ma, Haibing Guan","doi":"10.1145/3670689","DOIUrl":null,"url":null,"abstract":"<p>In the field of Natural Language Processing (NLP), Deep Learning (DL) and Neural Network (NN) technologies have been widely applied to machine translation and sentiment analysis and have demonstrated outstanding performance. In recent years, NLP applications have also combined multimodal data, such as visual and audio, continuously improving language processing performance. At the same time, the size of Neural Network models is increasing, and many models cannot be deployed on devices with limited computing resources. Deploying models on cloud platforms has become a trend. However, deploying models in the cloud introduces new privacy risks for endpoint data, despite overcoming computational limitations. Federated Learning (FL) methods protect local data by keeping the data on the client side and only sending local updates to the central server. However, the FL architecture still has problems, such as vulnerability to adversarial attacks and non-IID data distribution. In this work, we propose a Federated Learning aggregation method called FedREAS. The server uses a benchmark dataset to train a global model and obtains benchmark updates in this method. Before aggregating local updates, the server adjusts the local updates using the benchmark updates and then returns the adjusted benchmark updates. Then, based on the similarity between the adjusted local updates and the adjusted benchmark updates, the server aggregates these local updates to obtain a more robust update. This method also improves the client selection process. FedREAS selects suitable clients for training at the beginning of each round based on specific strategies, the similarity of the previous round’s updates, and the submitted data. We conduct experiments on different datasets and compare FedREAS with other Federated Learning methods. The results show that FedREAS outperforms other methods regarding model performance and resistance to attacks.</p>","PeriodicalId":54312,"journal":{"name":"ACM Transactions on Asian and Low-Resource Language Information Processing","volume":"70 1","pages":""},"PeriodicalIF":1.8000,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Asian and Low-Resource Language Information Processing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3670689","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

In the field of Natural Language Processing (NLP), Deep Learning (DL) and Neural Network (NN) technologies have been widely applied to machine translation and sentiment analysis, where they have demonstrated outstanding performance. In recent years, NLP applications have also incorporated multimodal data, such as visual and audio signals, further improving language processing performance. At the same time, Neural Network models keep growing in size, and many can no longer be deployed on devices with limited computing resources, so deploying models on cloud platforms has become a trend. However, while cloud deployment overcomes these computational limitations, it introduces new privacy risks for endpoint data. Federated Learning (FL) methods protect local data by keeping the data on the client side and sending only local updates to the central server. The FL architecture nevertheless still has problems, such as vulnerability to adversarial attacks and non-IID data distributions. In this work, we propose a Federated Learning aggregation method called FedREAS. In this method, the server trains a global model on a benchmark dataset to obtain benchmark updates. Before aggregating the local updates, the server adjusts them using the benchmark updates and returns the adjusted benchmark updates. Then, based on the similarity between the adjusted local updates and the adjusted benchmark updates, the server aggregates the local updates into a more robust global update. The method also improves the client selection process: at the beginning of each round, FedREAS selects suitable clients for training based on specific strategies, the similarity of the previous round's updates, and the submitted data. We conduct experiments on different datasets and compare FedREAS with other Federated Learning methods. The results show that FedREAS outperforms the other methods in terms of both model performance and resistance to attacks.
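Since the abstract only outlines the mechanism, the following is a minimal Python sketch of what benchmark-guided, similarity-weighted aggregation could look like. The cosine-similarity measure, the linear `alpha` blending used for the "adjustment" step, and all function names are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of benchmark-guided, similarity-weighted aggregation.
# The adjustment rule, similarity measure, and names are assumptions.
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened update vectors."""
    denom = float(np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.dot(a, b)) / denom if denom > 0 else 0.0


def aggregate(local_updates, benchmark_update, alpha=0.5):
    """Aggregate flattened client updates, weighting each by its similarity
    to the server's benchmark update after a simple linear adjustment.

    alpha is an assumed blending factor for the "adjust local updates using
    the benchmark updates" step; the paper's exact rule is not given here.
    """
    # Adjust each local update toward the benchmark update.
    adjusted = [(1 - alpha) * u + alpha * benchmark_update for u in local_updates]

    # Score each adjusted update; clamp at 0 so opposing (e.g., poisoned)
    # updates receive zero weight rather than a negative one.
    sims = np.array([max(cosine_similarity(u, benchmark_update), 0.0)
                     for u in adjusted])

    # Normalize similarities into aggregation weights (uniform fallback).
    total = sims.sum()
    weights = sims / total if total > 0 else np.full(len(adjusted), 1.0 / len(adjusted))

    # The weighted average is the robust global update.
    return np.sum([w * u for w, u in zip(weights, adjusted)], axis=0)


def select_clients(prev_round_sims: dict, k: int) -> list:
    """One plausible reading of the selection step: prefer the k clients
    whose previous-round updates were most similar to the benchmark."""
    return sorted(prev_round_sims, key=prev_round_sims.get, reverse=True)[:k]
```

Down-weighting updates that diverge from the benchmark is what gives this family of schemes its robustness to poisoned clients, at the cost of biasing the aggregation toward whatever the server's benchmark data represents.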

Source Journal
CiteScore: 3.60
Self-citation rate: 15.00%
Articles published: 241
Journal Description
The ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) publishes high quality original archival papers and technical notes in the areas of computation and processing of information in Asian languages, low-resource languages of Africa, Australasia, Oceania and the Americas, as well as related disciplines. The subject areas covered by TALLIP include, but are not limited to:
- Computational Linguistics: including computational phonology, computational morphology, computational syntax (e.g., parsing), computational semantics, computational pragmatics, etc.
- Linguistic Resources: including computational lexicography, terminology, electronic dictionaries, cross-lingual dictionaries, electronic thesauri, etc.
- Hardware and software algorithms and tools for Asian or low-resource language processing, e.g., handwritten character recognition.
- Information Understanding: including text understanding, speech understanding, character recognition, discourse processing, dialogue systems, etc.
- Machine Translation involving Asian or low-resource languages.
- Information Retrieval: including natural language processing (NLP) for concept-based indexing, natural language query interfaces, semantic relevance judgments, etc.
- Information Extraction and Filtering: including automatic abstraction, user profiling, etc.
- Speech processing: including text-to-speech synthesis and automatic speech recognition.
- Multimedia Asian Information Processing: including speech, image, video, image/text translation, etc.
- Cross-lingual information processing involving Asian or low-resource languages.
Papers that deal in theory, systems design, evaluation and applications in the aforesaid subjects are appropriate for TALLIP. Emphasis will be placed on the originality and the practical significance of the reported research.