{"title":"Federated learning-based natural language processing: a systematic literature review","authors":"Younas Khan, David Sánchez, Josep Domingo-Ferrer","doi":"10.1007/s10462-024-10970-5","DOIUrl":null,"url":null,"abstract":"<div><p>Federated learning (FL) is a decentralized machine learning (ML) framework that allows models to be trained without sharing the participants’ local data. FL thus preserves privacy better than centralized machine learning. Since textual data (such as clinical records, posts in social networks, or search queries) often contain personal information, many natural language processing (NLP) tasks dealing with such data have shifted from the centralized to the FL setting. However, FL is not free from issues, including convergence and security vulnerabilities (due to unreliable or poisoned data introduced into the model), communication and computation bottlenecks, and even privacy attacks orchestrated by honest-but-curious servers. In this paper, we present a systematic literature review (SLR) of NLP applications in FL with a special focus on FL issues and the solutions proposed so far. Our review surveys 36 recent papers published in relevant venues, which are systematically analyzed and compared from multiple perspectives. As a result of the survey, we also identify the most outstanding challenges in the area.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"57 12","pages":""},"PeriodicalIF":10.7000,"publicationDate":"2024-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-024-10970-5.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence Review","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10462-024-10970-5","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Federated learning (FL) is a decentralized machine learning (ML) framework that allows models to be trained without sharing the participants’ local data. FL thus preserves privacy better than centralized machine learning. Since textual data (such as clinical records, posts in social networks, or search queries) often contain personal information, many natural language processing (NLP) tasks dealing with such data have shifted from the centralized to the FL setting. However, FL is not free from issues, including convergence and security vulnerabilities (due to unreliable or poisoned data introduced into the model), communication and computation bottlenecks, and even privacy attacks orchestrated by honest-but-curious servers. In this paper, we present a systematic literature review (SLR) of NLP applications in FL with a special focus on FL issues and the solutions proposed so far. Our review surveys 36 recent papers published in relevant venues, which are systematically analyzed and compared from multiple perspectives. As a result of the survey, we also identify the most outstanding challenges in the area.
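To make the core idea concrete, below is a minimal, illustrative sketch of federated averaging (FedAvg-style aggregation), the kind of FL training loop the abstract alludes to: each participant trains on its own private data and only model weights are sent to the server for averaging. This is not the method of the reviewed paper; the linear model, the function names (local_step, fed_avg), and the simulated clients are all assumptions made for illustration.

```python
# Minimal FedAvg-style sketch (illustrative only, not the paper's method).
# Assumes a linear regression model trained with plain gradient descent.
import numpy as np

def local_step(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: gradient descent on its own (X, y) only."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # MSE gradient for a linear model
        w -= lr * grad
    return w

def fed_avg(global_w, clients, rounds=10):
    """Server loop: broadcast weights, collect local updates, average them.
    Raw client data never leaves the client; only weights are exchanged."""
    for _ in range(rounds):
        local_ws = [local_step(global_w, X, y) for X, y in clients]
        sizes = np.array([len(y) for _, y in clients], dtype=float)
        global_w = np.average(local_ws, axis=0, weights=sizes)  # data-size-weighted mean
    return global_w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -1.0])
    clients = []
    for _ in range(3):                           # three simulated clients with private data
        X = rng.normal(size=(50, 2))
        y = X @ true_w + 0.1 * rng.normal(size=50)
        clients.append((X, y))
    w = fed_avg(np.zeros(2), clients)
    print("learned weights:", w)                 # should approach [2.0, -1.0]
```

In a text-processing deployment, the weight vector would be the parameters of an NLP model (e.g., a language model fine-tuned on clinical notes or search queries), and the issues the abstract lists (poisoned updates, communication cost, honest-but-curious servers) arise precisely in this exchange of model updates.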
Journal Introduction
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.