Hybrid framework of differential privacy and secure multi-party computation for privacy-preserving entity resolution

Impact factor 5.4 · Zone 2 (Computer Science) · JCR Q1 (Computer Science, Information Systems)
Maxwell Dorgbefu Jnr , Yaw Marfo Missah , Najim Ussiph , Gaddafi Abdul-Salaam , Oliver Kornyo , Joseph Mawulorm Mensah
Computers & Security, volume 157, Article 104603. Published 2025-07-19. DOI: 10.1016/j.cose.2025.104603
https://www.sciencedirect.com/science/article/pii/S0167404825002925
Citations: 0

Abstract

Exponential improvements in hardware design and precision, coupled with sophisticated software systems, underlie unprecedented rates of data generation and storage. However, extracting actionable knowledge, formulating impactful policies, and making insightful decisions from these massive datasets rely on data integration, with entity resolution as its core task. Despite significant advances in entity resolution methods, the risk of data breaches, matching accuracy, utility, and scalability remain critical challenges for the data science research community. This study introduces a novel hybrid framework of differential privacy (DP) and secure multi-party computation (SMPC) for privacy-preserving entity resolution (PPER), addressing critical data utility and confidentiality challenges. We evaluated the framework on the Febrl4 and North Carolina Voter Registration (NCVR) datasets across three supervised machine learning models (Logistic Regression, SVM, Naïve Bayes) with adaptive ε-allocation (ε from 0.1 to 5.0), demonstrating the privacy-utility trade-off. Our findings show that the framework maintains high linkage utility, with F1-scores consistently above 0.81 even under a stringent privacy budget (ε = 0.1) and above 0.90 at moderate ε values; the support vector machine in particular exhibits robust performance. This research provides empirical evidence and theoretical guarantees for developing practical and ethically compliant PPER solutions, and offers clear guidance for balancing data utility with privacy requirements across diverse application domains.
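To make the role of ε concrete, the sketch below shows how the noise added by a differentially private release grows as ε shrinks. It assumes the standard Laplace mechanism, a hypothetical similarity score of 0.8, and a sensitivity of 1.0; the abstract does not state which DP mechanism, features, or sensitivity the framework uses, so this illustrates the general trade-off only and is not the authors' method. The function names are illustrative.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(value: float, epsilon: float, sensitivity: float,
              rng: random.Random) -> float:
    """Laplace mechanism: add noise with scale = sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(epsilon: float, sensitivity: float = 1.0,
                   trials: int = 20000, seed: int = 0) -> float:
    """Average absolute distortion of a released value at a given epsilon."""
    rng = random.Random(seed)
    true_value = 0.8  # hypothetical similarity score (assumption, not from the paper)
    total = 0.0
    for _ in range(trials):
        noisy = privatize(true_value, epsilon, sensitivity, rng)
        total += abs(noisy - true_value)
    return total / trials


if __name__ == "__main__":
    # Sweep over the epsilon range quoted in the abstract (0.1 to 5.0).
    for eps in (0.1, 0.5, 1.0, 5.0):
        print(f"epsilon={eps}: mean |noise| ~ {mean_abs_error(eps):.3f}")
```

For Laplace noise the expected absolute distortion equals sensitivity/ε, so a smaller ε (stronger privacy) means proportionally more noise on each released value. That is the utility cost a PPER pipeline has to absorb at the stringent end of the budget.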
Source journal: Computers & Security (Engineering and Technology / Computer Science: Information Systems)
CiteScore: 12.40
Self-citation rate: 7.10%
Articles published: 365
Review time: 10.7 months
Journal description: Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world. Computers & Security provides you with a unique blend of leading edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors - industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise it is your first step to fully secure systems.