Federated Learning to Preserve the Privacy of User Data

2023 Somaiya International Conference on Technology and Information Management (SICTIM). Pub Date: 2023-03-24. DOI: 10.1109/SICTIM56495.2023.10104860

Harsh Shah, Rutwik Patel, Prachi Tawde



Abstract

Protecting the privacy of users’ data is a pressing problem, and it is especially critical for medical data, which is extremely sensitive. Traditional machine learning (ML) algorithms require a single centralized source of data, which frequently compromises privacy because the data must be shared before the algorithm can be trained. Federated learning instead trains ML algorithms on local private data sets spread across many sites, which safeguards the privacy of the data. Our proposed method attempts to diagnose two urinary system diseases. We begin by training a logistic regression model. Then, to simulate federated learning, we divide our data set into three partitions, one per party. Each party trains the model on its local data set and sends its update to a trusted aggregator, which averages the updates from all parties. The averaged model is then sent back to every party at the start of each iteration. The resulting system successfully identifies the two urinary system disorders while the data remains in distinct locations and is never shared between parties, considerably improving the privacy of such sensitive data.
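The training loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is synthetic (the paper's medical data set is not reproduced here), and the function names, learning rate, epoch count, and round count are all hypothetical choices. It shows three parties training a logistic regression model locally and a trusted aggregator averaging their weights each round.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, x):
    # weights[0] is the bias; weights[1:] pair with the features in x.
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


def local_train(weights, data, lr=0.5, epochs=5):
    """One party's local update: a few epochs of full-batch gradient descent."""
    w = list(weights)  # copy, so the received global model is not mutated
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict_proba(w, x) - y
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        w = [wj - lr * gj / len(data) for wj, gj in zip(w, grad)]
    return w


def average(models):
    """Trusted aggregator: element-wise mean of the parties' weight vectors."""
    return [sum(ws) / len(models) for ws in zip(*models)]


def federated_train(parties, n_features, rounds=30):
    global_w = [0.0] * (n_features + 1)
    for _ in range(rounds):
        # Each round starts with every party receiving the averaged model.
        # Only weights travel to the aggregator; raw rows stay with their party.
        local_models = [local_train(global_w, data) for data in parties]
        global_w = average(local_models)
    return global_w


def accuracy(weights, data):
    hits = sum((predict_proba(weights, x) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)


def make_synthetic(n, seed=0):
    # Hypothetical stand-in data (assumption): label is 1 when x0 + x1 > 0.
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        rows.append((x, 1 if x[0] + x[1] > 0 else 0))
    return rows


# Usage: split one data set into three sections to simulate three parties.
data = make_synthetic(300)
parties = [data[0:100], data[100:200], data[200:300]]
model = federated_train(parties, n_features=2)
print("accuracy of the averaged model:", accuracy(model, data))
```

The plain mean in `average` matches the abstract's description of averaging all updates. With equally sized partitions, as here, it coincides with a mean weighted by sample count; for unequal partitions, weighting by sample count is a common alternative.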
