Authors: Zhaowang Hu, Jun Ye, Zhengqi Zhang
Journal: Computers & Security, Volume 159, Article 104701
Publication date: 2025-10-11
DOI: 10.1016/j.cose.2025.104701
URL: https://www.sciencedirect.com/science/article/pii/S0167404825003906
Journal metrics: impact factor 5.4; JCR Q1 (Computer Science, Information Systems)
Faster secure and efficient collaborative private data cleaning based on PSI
Mislabeled datasets are common in real-world detection of malicious software behavior. When two Security Operation Centers (SOCs) classify the same malware attack into different threat categories because their detection methodologies differ, the disagreement creates significant challenges and security risks for subsequent operations. Through collaboration, both parties aim to align their datasets by filtering out severely misclassified or erroneously labeled entries while preserving privacy. In this privacy-preserving collaborative data cleaning scenario, each party may learn only the contents of the intersection and the misclassified items within it, without obtaining any private information about non-intersection entries. To address this challenge, we propose a novel Secure and Efficient Collaborative Private Data Cleaning Scheme (SCPDC). The scheme comprises two phases: an offline phase that pre-generates computationally expensive share tuples and performs label encoding, and an online phase that uses these pre-generated share tuples and encoded vectors to execute a variant of the labeled private set intersection (PSI) protocol, which identifies misclassified items in the intersection. SCPDC achieves a highly efficient online phase while fulfilling the privacy requirements of both parties. Security analysis and experimental results demonstrate that SCPDC offers reasonable execution time and lower communication overhead than existing related work.
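As a rough illustration of the functionality described above, the sketch below computes in plaintext what each party is meant to learn: the intersection and the items within it whose labels disagree. It is not the SCPDC protocol and provides no privacy. It only states the target output that a secure protocol would have to reproduce without revealing non-intersection entries. The function name, the dictionary representation, and the sample hashes and labels are all assumptions made for this example.

```python
from typing import Dict, Set, Tuple


def ideal_cleaning_output(
    labels_a: Dict[str, str],
    labels_b: Dict[str, str],
) -> Tuple[Set[str], Set[str]]:
    """Plaintext reference for the intended output (hypothetical helper).

    Returns (intersection, mismatched), where `mismatched` holds the items
    in the intersection whose labels differ between the two parties.
    Nothing about items outside the intersection is returned.
    """
    # Items that both parties hold.
    intersection = set(labels_a) & set(labels_b)
    # Items in the intersection on which the two labelings disagree.
    mismatched = {
        item for item in intersection if labels_a[item] != labels_b[item]
    }
    return intersection, mismatched


# Hypothetical sample data: sample identifier -> threat category.
soc_a = {"hash1": "ransomware", "hash2": "trojan", "hash3": "worm"}
soc_b = {"hash2": "trojan", "hash3": "spyware", "hash4": "adware"}

common, conflicts = ideal_cleaning_output(soc_a, soc_b)
print(sorted(common))     # items both SOCs hold
print(sorted(conflicts))  # items the SOCs label differently
```

In a real deployment, neither party would see the other's dictionary. The plaintext function is useful mainly as a correctness oracle when testing a privacy-preserving implementation.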
About the journal:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at professionals involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as the primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.