Machine Learning with Distributed Processing using Secure Divided Data: Towards Privacy-Preserving Advanced AI Processing in a Super-Smart Society

Journal of Networking and Network Applications Pub Date : 1900-01-01 DOI:10.33969/j-nana.2022.020105

H. Miyajima, Noritaka Shigei, H. Miyajima, N. Shiratori

Citations: 4

Abstract

To help realize a super-smart society, AI analysis methods that preserve the privacy of big data in cyberspace are being developed. To make machine learning a secure and safe AI analysis method for users, many studies in this field have examined 1) secure multiparty computation (SMC), 2) quasi-homomorphic encryption, and 3) federated learning, among other techniques. Previous studies have shown that both security and utility are essential for machine learning on confidential data. However, the two properties trade off against each other, and no known method satisfies both at a high level simultaneously. In this paper, we propose a learning method based on distributed processing of simple, secure, divided data and parameters, as a method superior in both data privacy and utility. In this method, individual data and parameters are divided in advance into multiple pieces using random numbers, and the pieces are stored across the servers, one piece per server. Learning proceeds on the data and parameters in their divided form, by repeating partial computations on each server and integrated computations at the central server. The proposed method has three advantages: it preserves data privacy because the data and parameters are never restored during learning; it improves usability by realizing machine learning through distributed processing, as federated learning does; and it shows almost no degradation in accuracy compared to conventional methods. Based on the proposed method, we present backpropagation and neural gas (NG) algorithms as examples of supervised and unsupervised machine learning applications. Our numerical simulations show that these algorithms can achieve accuracy comparable to conventional models.
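
The divide-then-compute idea in the abstract can be illustrated with a small Python sketch. The abstract does not state the exact division rule, so this sketch assumes an additive split (the pieces of a value sum back to that value), and it shows only a simple sum rather than backpropagation or neural gas. The names `divide`, `partial_sum`, and `integrate` and the `spread` parameter are hypothetical, chosen for illustration; they are not taken from the paper.

```python
import random


def divide(value, n_servers, rng, spread=100.0):
    """Split `value` into n_servers pieces that add up to `value`.

    Assumption: additive division with uniform random masks. This is an
    illustrative choice, not the scheme specified by the paper.
    """
    pieces = [rng.uniform(-spread, spread) for _ in range(n_servers - 1)]
    pieces.append(value - sum(pieces))  # last piece makes the sum exact
    return pieces


def partial_sum(server_pieces):
    """Partial computation on one server: it only ever sees its own pieces."""
    return sum(server_pieces)


def integrate(partials):
    """Integrated computation at the central server: combine partial results."""
    return sum(partials)


rng = random.Random(0)          # fixed seed so the run is repeatable
data = [3.5, -1.25, 8.0, 2.75]  # confidential values (made-up example data)
n_servers = 3

# Divide every value in advance; server k stores the k-th piece of each value.
shares = [divide(x, n_servers, rng) for x in data]
per_server = [[s[k] for s in shares] for k in range(n_servers)]

# Each server computes on its pieces; the central server sees only partials.
partials = [partial_sum(p) for p in per_server]
total = integrate(partials)

print(f"sum from divided data: {total:.6f}")
print(f"sum from original data: {sum(data):.6f}")
```

Because a sum is linear, the central server obtains the same result as it would from the original values, up to floating-point rounding, without any server reconstructing an individual value. Nonlinear steps, such as products of divided data with divided parameters, need additional handling that this sketch does not attempt.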
