Maxwell Dorgbefu Jnr , Yaw Marfo Missah , Najim Ussiph , Gaddafi Abdul-Salaam , Oliver Kornyo , Joseph Mawulorm Mensah
{"title":"基于差分隐私和安全多方计算的保护隐私实体解析混合框架","authors":"Maxwell Dorgbefu Jnr , Yaw Marfo Missah , Najim Ussiph , Gaddafi Abdul-Salaam , Oliver Kornyo , Joseph Mawulorm Mensah","doi":"10.1016/j.cose.2025.104603","DOIUrl":null,"url":null,"abstract":"<div><div>The exponential improvement and precision in hardware design, coupled with sophisticated software systems, are the basis of unprecedented rates of data generation and storage. However, extracting actionable knowledge, formulating impactful policies, and making insightful decisions from these massive datasets rely on data integration with entity resolution as its core task. Despite significant advances in entity resolution methods, the risk of data breaches, matching accuracy, utility and scalability remain critical challenges to the data science research community. This study introduces a novel hybrid framework of differential privacy (DP) and secure multi-party computation (SMPC) for privacy-preserving entity resolution (PPER), thereby addressing critical data utility and confidentiality challenges. We rigorously evaluated the framework using the Febrl4 and North Carolina Voter Registration (NCVR) datasets across three supervised machine learning models (Logistic Regression, SVM, Naïve Bayes), through adaptive <em>ε</em>-allocation (0.1 to 5.0), demonstrating the crucial privacy-utility trade-off. Our findings reveal that the framework maintains high linkage utility, with F1-scores consistently above 0.81 even under stringent privacy budgets (ϵ=0.1), and achieving over 0.90 at moderate ϵ values, notably with support vector machine exhibiting robust performance. This research provides empirical evidence and theoretical guarantees for developing highly practical and ethically compliant PPER solutions, offering clear guidance for balancing data utility with privacy requirements across diverse application domains.</div></div>","PeriodicalId":51004,"journal":{"name":"Computers & Security","volume":"157 ","pages":"Article 104603"},"PeriodicalIF":5.4000,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Hybrid framework of differential privacy and secure multi-party computation for privacy-preserving entity resolution\",\"authors\":\"Maxwell Dorgbefu Jnr , Yaw Marfo Missah , Najim Ussiph , Gaddafi Abdul-Salaam , Oliver Kornyo , Joseph Mawulorm Mensah\",\"doi\":\"10.1016/j.cose.2025.104603\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The exponential improvement and precision in hardware design, coupled with sophisticated software systems, are the basis of unprecedented rates of data generation and storage. However, extracting actionable knowledge, formulating impactful policies, and making insightful decisions from these massive datasets rely on data integration with entity resolution as its core task. Despite significant advances in entity resolution methods, the risk of data breaches, matching accuracy, utility and scalability remain critical challenges to the data science research community. This study introduces a novel hybrid framework of differential privacy (DP) and secure multi-party computation (SMPC) for privacy-preserving entity resolution (PPER), thereby addressing critical data utility and confidentiality challenges. We rigorously evaluated the framework using the Febrl4 and North Carolina Voter Registration (NCVR) datasets across three supervised machine learning models (Logistic Regression, SVM, Naïve Bayes), through adaptive <em>ε</em>-allocation (0.1 to 5.0), demonstrating the crucial privacy-utility trade-off. Our findings reveal that the framework maintains high linkage utility, with F1-scores consistently above 0.81 even under stringent privacy budgets (ϵ=0.1), and achieving over 0.90 at moderate ϵ values, notably with support vector machine exhibiting robust performance. This research provides empirical evidence and theoretical guarantees for developing highly practical and ethically compliant PPER solutions, offering clear guidance for balancing data utility with privacy requirements across diverse application domains.</div></div>\",\"PeriodicalId\":51004,\"journal\":{\"name\":\"Computers & Security\",\"volume\":\"157 \",\"pages\":\"Article 104603\"},\"PeriodicalIF\":5.4000,\"publicationDate\":\"2025-07-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers & Security\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167404825002925\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Security","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167404825002925","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Hybrid framework of differential privacy and secure multi-party computation for privacy-preserving entity resolution
The exponential improvement and precision in hardware design, coupled with sophisticated software systems, are the basis of unprecedented rates of data generation and storage. However, extracting actionable knowledge, formulating impactful policies, and making insightful decisions from these massive datasets rely on data integration with entity resolution as its core task. Despite significant advances in entity resolution methods, the risk of data breaches, matching accuracy, utility and scalability remain critical challenges to the data science research community. This study introduces a novel hybrid framework of differential privacy (DP) and secure multi-party computation (SMPC) for privacy-preserving entity resolution (PPER), thereby addressing critical data utility and confidentiality challenges. We rigorously evaluated the framework using the Febrl4 and North Carolina Voter Registration (NCVR) datasets across three supervised machine learning models (Logistic Regression, SVM, Naïve Bayes), through adaptive ε-allocation (0.1 to 5.0), demonstrating the crucial privacy-utility trade-off. Our findings reveal that the framework maintains high linkage utility, with F1-scores consistently above 0.81 even under stringent privacy budgets (ϵ=0.1), and achieving over 0.90 at moderate ϵ values, notably with support vector machine exhibiting robust performance. This research provides empirical evidence and theoretical guarantees for developing highly practical and ethically compliant PPER solutions, offering clear guidance for balancing data utility with privacy requirements across diverse application domains.
期刊介绍:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors - industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise it is your first step to fully secure systems.