Privacy-preserving multi-party logistic regression in cloud computing

IF 4.1 · CAS Zone 2 (Computer Science) · JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computer Standards & Interfaces · Pub Date: 2024-04-10 · DOI: 10.1016/j.csi.2024.103857

Huiyong Wang, Tianming Chen, Yong Ding, Yujue Wang, Changsong Yang

Volume 90, Article 103857. Full text: https://www.sciencedirect.com/science/article/pii/S0920548924000266

Citations: 0

Abstract

In recent years, machine learning techniques have been widely deployed in various fields. However, machine learning faces problems such as high computation overhead, low training accuracy, and poor security, caused by data silos, privacy concerns, and communication limitations, especially in cloud computing environments. Logistic regression (LR) is a popular machine learning method for prediction, but current LR algorithms suffer from high computation cost and communication burden due to interactions between users and cloud servers. In this paper, we propose a Privacy-Preserving Multi-party Logistic Regression (PPMLR) algorithm, which achieves privacy-preserving, non-interactive gradient-descent training. PPMLR uses the Distributed Two Trapdoors Public-Key Cryptosystem (DT-PKC), an additively homomorphic encryption scheme, as its main building block. Specifically, users go offline after encrypting their local private data; the service provider ($SP$) then trains the global logistic regression model by interacting with the cloud server ($CS$), so that the confidentiality and privacy of users' private data are guaranteed during training. We give a detailed security proof that PPMLR guarantees data and model privacy. Finally, we conduct experiments on two popular medical datasets from the UCI machine learning repository. The experimental results show that PPMLR can conduct privacy-preserving training efficiently. Comparison with the state-of-the-art Privacy-Preserving Logistic Regression Algorithm (PPLRA) shows that the model training time is reduced by a factor of about 4.
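The additive homomorphism the abstract relies on can be illustrated with a small sketch. This is not DT-PKC and not the paper's protocol: it uses textbook Paillier as a stand-in additively homomorphic scheme, with toy-sized primes, and all function names (`keygen`, `encrypt`, `encrypted_dot`, and so on) are hypothetical. It shows how a party holding only ciphertexts of a user's features could compute an encrypted weighted sum, which is the linear part of a logistic regression prediction, without decrypting anything. Non-negative integer inputs are assumed; real-valued data would need a fixed-point encoding.

```python
import math
import secrets

# Two known Mersenne primes; far too small for real security (demo only).
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p=P, q=Q):
    """Return (public n, private (lam, mu)) for textbook Paillier with g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) as (1 + m*n) * r^n mod n^2."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = priv
    n2 = n * n
    return (pow(c, lam, n2) - 1) // n * mu % n


def add(n, c1, c2):
    """Homomorphic addition: Enc(a) * Enc(b) mod n^2 decrypts to a + b."""
    return c1 * c2 % (n * n)


def scalar_mul(n, c, k):
    """Homomorphic scaling: Enc(a)^k mod n^2 decrypts to k * a."""
    return pow(c, k, n * n)


def encrypted_dot(n, enc_x, w):
    """Compute Enc(sum_i w_i * x_i) from encrypted x_i and plaintext w_i."""
    acc = encrypt(n, 0)
    for c, k in zip(enc_x, w):
        acc = add(n, acc, scalar_mul(n, c, k))
    return acc


n, priv = keygen()
enc_features = [encrypt(n, x) for x in [3, 5, 2]]  # a user encrypts, then goes offline
weights = [4, 1, 10]                               # plaintext weights held by the trainer
result = decrypt(n, priv, encrypted_dot(n, enc_features, weights))
print(result)  # 3*4 + 5*1 + 2*10 = 37
```

Additive homomorphism covers sums and plaintext scaling only; the sigmoid in logistic regression is not a linear operation, so a complete scheme needs further machinery that this sketch does not attempt to reproduce.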




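Source journal

Computer Standards & Interfaces (Engineering & Technology; Computer Science: Software Engineering)

CiteScore: 11.90
Self-citation rate: 16.00%
Articles published: not stated
Review time: 6 months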

About the journal: The quality of software, well-defined interfaces (hardware and software), the process of digitalisation, and accepted standards in these fields are essential for building and exploiting complex computing, communication, multimedia and measuring systems. Standards can simplify the design and construction of individual hardware and software components and help to ensure satisfactory interworking. Computer Standards & Interfaces is an international journal dealing specifically with these topics. The journal:
• Provides information about activities and progress on the definition of computer standards, software quality, interfaces and methods, at national, European and international levels
• Publishes critical comments on standards and standards activities
• Disseminates users' experiences and case studies in the application and exploitation of established or emerging standards, interfaces and methods
• Offers a forum for discussion on actual projects, standards, interfaces and methods by recognised experts
• Stimulates relevant research by providing a specialised refereed medium