Enhancing Accuracy-Privacy Trade-Off in Differentially Private Split Learning

IF 5.3 | Region 3: Computer Science | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Emerging Topics in Computational Intelligence Pub Date : 2024-10-31 DOI:10.1109/TETCI.2024.3485723

Ngoc Duy Pham;Khoa T. Phan;Naveen Chilamkurti

Vol. 9, No. 1, pp. 988-1000. Full text: https://ieeexplore.ieee.org/document/10740400/

Citations: 0

Abstract

Split learning (SL) aims to protect user data privacy by splitting a deep model between the clients and the server and keeping private data local. During SL, only processed, or ‘smashed’, data is transmitted from the clients to the server. However, recently proposed model inversion attacks can recover the original data from smashed data. One strategy for strengthening privacy protection against such attacks is differential privacy (DP), which safeguards the smashed data at the cost of some accuracy. This paper presents the first investigation into how accuracy is affected when multiple clients with differing privacy requirements are trained in SL. We then propose an approach that reviews the DP noise distributions of other clients during client training to address the identified accuracy degradation. We also examine the application of DP to the local model of SL to gain insight into the trade-off between accuracy and privacy. Specifically, the findings reveal that introducing noise in the later local layers offers the most favorable balance between accuracy and privacy. Drawing on our insights from the shallower layers, we propose an approach that reduces the size of the smashed data to minimize data leakage while maintaining higher accuracy, optimizing the accuracy-privacy trade-off. Smaller smashed data also reduces communication overhead on the client side, mitigating one of the notable drawbacks of SL. Extensive experiments on various datasets demonstrate that the proposed approaches provide an optimal trade-off for incorporating DP into SL, ultimately enhancing training accuracy for multi-client SL with varying privacy requirements.
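As a rough illustration of what applying DP to smashed data can look like, the sketch below clips each client's smashed output and adds Laplace noise before it would be sent to the server. The abstract does not say which noise mechanism, clipping rule, or model the authors use, so everything here is an assumption made for illustration: the single dense layer in `client_forward`, the per-sample L1 clipping in `clip_l1`, the Laplace mechanism in `privatize`, and the parameter names `epsilon` and `bound`.

```python
import numpy as np


def client_forward(x, weights):
    """Hypothetical client-side model: one dense layer followed by ReLU."""
    return np.maximum(x @ weights, 0.0)


def clip_l1(smashed, bound):
    """Scale each row (one sample) so that its L1 norm is at most `bound`."""
    norms = np.abs(smashed).sum(axis=1, keepdims=True)
    factors = np.minimum(1.0, bound / np.maximum(norms, 1e-12))
    return smashed * factors


def privatize(smashed, epsilon, bound, rng):
    """Laplace mechanism on clipped smashed data (an assumed mechanism).

    After clipping, any two rows differ by at most 2 * bound in L1 norm,
    so the Laplace noise scale is set to 2 * bound / epsilon.
    """
    clipped = clip_l1(smashed, bound)
    scale = 2.0 * bound / epsilon
    return clipped + rng.laplace(0.0, scale, size=clipped.shape)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))        # four private samples held by one client
weights = rng.normal(size=(8, 3))  # client-side layer weights
smashed = client_forward(x, weights)

# A smaller epsilon means stronger privacy and a larger perturbation.
for epsilon in (0.5, 5.0):
    noisy = privatize(smashed, epsilon, bound=1.0, rng=rng)
    err = np.abs(noisy - clip_l1(smashed, 1.0)).mean()
    print(f"epsilon={epsilon}: mean abs perturbation {err:.3f}")
```

In a multi-client setting with different privacy requirements, each client would call `privatize` with its own `epsilon`, so the server receives smashed data at different noise levels. That is the situation in which the abstract reports accuracy degradation.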


Source journal

IEEE Transactions on Emerging Topics in Computational Intelligence (Mathematics-Control and Optimization)

CiteScore: 10.30

Self-citation rate: 7.50%

Articles published: 147

Journal description: The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys. TETCI is an electronic-only publication and publishes six issues per year. Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. A few illustrative examples are glial cell networks, computational neuroscience, Brain Computer Interface, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.