A Novel Differentially Private Online Learning Algorithm for Group Lasso in Big Data

IF 2.6 | Zone 4, Computer Science | JCR Q3, COMPUTER SCIENCE, INFORMATION SYSTEMS

IET Information Security Pub Date : 2024-10-24 DOI:10.1049/2024/5553292

Jinxia Li, Liwei Lu


Citations: 0

Abstract

This study addresses the challenge of extracting valuable information and selecting key variables from large datasets, a task essential across statistics, computational science, and data science. In the age of big data, where safeguarding personal privacy is paramount, it presents an online learning algorithm that uses differential privacy to handle large-scale data effectively, with a focus on enhancing the online group lasso approach under differential privacy. The study begins by comparing online and offline learning approaches and classifying common online learning techniques, then explains the concept of differential privacy and its importance. By enhancing the group-follow-the-proximally-regularized-leader (GFTPRL) algorithm, we obtain a new method for the online group lasso model that integrates differential privacy for binary classification with logistic regression. The study validates the algorithm’s effectiveness on the basis of differential privacy and online learning principles, and evaluates its performance through simulations on both synthetic and real data. The proposed privacy-preserving algorithm is compared with traditional non-privacy-preserving counterparts, with a focus on regret bounds as the measure of performance. The findings underscore the practical benefits of the differentially private algorithm for large-scale data analysis while upholding privacy standards. This research marks a significant step forward in combining big data analytics with the safeguarding of individual privacy.
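The abstract does not state the GFTPRL update itself, so the following is only a minimal sketch of the general idea it describes: an online logistic-regression learner with a group lasso penalty whose gradients are perturbed with noise. It swaps in a regularized-dual-averaging-style update with group soft-thresholding, which is not claimed to be the authors' method. The function names (`group_shrink`, `clipped_logistic_grad`, `noisy_group_da`) and the parameters `lam`, `gamma`, `clip`, and `sigma` are all hypothetical, and `sigma` is a free noise scale that is not calibrated to any privacy budget, so the sketch carries no differential privacy guarantee.

```python
import math
import random


def group_shrink(z, groups, lam):
    """Group soft-thresholding: scale each group of z by max(0, 1 - lam / ||z_g||)."""
    out = [0.0] * len(z)
    for idx in groups:
        norm = math.sqrt(sum(z[i] ** 2 for i in idx))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        for i in idx:
            out[i] = scale * z[i]
    return out


def clipped_logistic_grad(w, x, y, clip):
    """Gradient of log(1 + exp(-y * <w, x>)) for y in {-1, +1}, clipped to L2 norm `clip`."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    # Numerically stable sigmoid(-margin).
    if margin >= 0:
        e = math.exp(-margin)
        sig = e / (1.0 + e)
    else:
        sig = 1.0 / (1.0 + math.exp(margin))
    g = [-y * sig * xi for xi in x]
    norm = math.sqrt(sum(gi ** 2 for gi in g))
    factor = min(1.0, clip / norm) if norm > 0 else 1.0
    return [factor * gi for gi in g]


def noisy_group_da(stream, dim, groups, lam=0.1, gamma=1.0, clip=1.0, sigma=0.0, seed=0):
    """Dual-averaging-style online group lasso with Gaussian-perturbed gradients.

    Assumption: independent Gaussian noise of scale `sigma` is added to every
    coordinate of each clipped gradient; `sigma` is NOT calibrated to (epsilon, delta).
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    grad_sum = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        g = clipped_logistic_grad(w, x, y, clip)
        # Accumulate the noisy gradient; only this running sum is used afterwards.
        grad_sum = [s + gi + rng.gauss(0.0, sigma) for s, gi in zip(grad_sum, g)]
        avg = [s / t for s in grad_sum]
        # Closed-form minimizer of <avg, w> + lam * sum_g ||w_g|| + (gamma / (2 sqrt(t))) ||w||^2.
        shrunk = group_shrink(avg, groups, lam)
        step = math.sqrt(t) / gamma
        w = [-step * v for v in shrunk]
    return w


if __name__ == "__main__":
    data_rng = random.Random(1)
    groups = [[0, 1], [2, 3]]
    stream = []
    for _ in range(200):
        a, b = data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)
        label = 1 if a + b >= 0 else -1
        stream.append(([a, b, 0.0, 0.0], label))  # second group carries no signal
    print(noisy_group_da(stream, dim=4, groups=groups, sigma=0.0))
    print(noisy_group_da(stream, dim=4, groups=groups, sigma=0.5))
```

With `sigma=0.0` the uninformative second group stays at exactly zero, which is the variable-selection behavior a group lasso penalty is meant to give. With `sigma > 0` the injected noise can push a group's averaged gradient above the threshold `lam`, which illustrates the kind of gap that a regret comparison between private and non-private variants would measure.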


Source journal

IET Information Security (Engineering & Technology - Computer Science: Theory & Methods)

CiteScore: 3.80

Self-citation rate: 7.10%

Review time: 8.6 months

About the journal: IET Information Security publishes original research papers in the following areas of information security and cryptography. Submitting authors should specify clearly in their covering statement the area into which their paper falls.

Scope: Access Control and Database Security; Ad-Hoc Network Aspects; Anonymity and E-Voting; Authentication; Block Ciphers and Hash Functions; Blockchain, Bitcoin (Technical aspects only); Broadcast Encryption and Traitor Tracing; Combinatorial Aspects; Covert Channels and Information Flow; Critical Infrastructures; Cryptanalysis; Dependability; Digital Rights Management; Digital Signature Schemes; Digital Steganography; Economic Aspects of Information Security; Elliptic Curve Cryptography and Number Theory; Embedded Systems Aspects; Embedded Systems Security and Forensics; Financial Cryptography; Firewall Security; Formal Methods and Security Verification; Human Aspects; Information Warfare and Survivability; Intrusion Detection; Java and XML Security; Key Distribution; Key Management; Malware; Multi-Party Computation and Threshold Cryptography; Peer-to-peer Security; PKIs; Public-Key and Hybrid Encryption; Quantum Cryptography; Risks of using Computers; Robust Networks; Secret Sharing; Secure Electronic Commerce; Software Obfuscation; Stream Ciphers; Trust Models; Watermarking and Fingerprinting.

Special Issues. Current Call for Papers: Security on Mobile and IoT devices - https://digital-library.theiet.org/files/IET_IFS_SMID_CFP.pdf