Multi-source data outlier detection based on secure multi-party computation
Authors: Lin Yao, Zhaolong Zheng, Tian Wei, Guowei Wu
Journal: Information Systems, volume 135, Article 102597 (JCR Q2, Computer Science, Information Systems; impact factor 3.4)
Published: 2025-08-05
DOI: 10.1016/j.is.2025.102597
URL: https://www.sciencedirect.com/science/article/pii/S030643792500081X
Citations: 0
Abstract
Outlier detection is an important technique for discovering abnormal data and has been applied in many fields, such as financial fraud detection, fault detection, and health diagnosis. Performing outlier detection on multi-source data requires data sharing. However, sharing data among multiple sources generally discloses private information embedded in the data, such as sensitive patient information. With the increasing emphasis on personal privacy, it is necessary to study how to achieve outlier detection on multi-source data while preserving privacy. Secure Multi-Party Computation (SMPC) is a privacy-preserving technology that enables secure computation among multiple sources without a trusted third party. However, because of frequent data interaction, complex calculations under SMPC bring high complexity and low practicality. In this paper, we propose a secure multi-source data outlier detection scheme based on SMPC. Our scheme uses homomorphic encryption and perturbation to protect the critical process of calculating the global distance matrix, which greatly reduces the complexity of the secure computation. In addition, we design an outlier determination strategy that reduces the steps of searching for reverse neighbors and calculating the final local outlier factor. Compared with existing schemes, our scheme performs better in terms of accuracy ratio, running time, and efficiency.
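For readers unfamiliar with the local outlier factor (LOF) mentioned in the abstract, the following is a minimal plaintext sketch of the standard LOF definition computed from a precomputed distance matrix. It is not the authors' protocol: it uses no encryption or perturbation, and it does not implement their outlier determination strategy. The function names `distance_matrix` and `local_outlier_factors` are hypothetical, and as a simplification each point uses exactly its k nearest neighbors, with ties broken by index.

```python
import math


def distance_matrix(points):
    """Plaintext Euclidean distance matrix (the paper computes this securely)."""
    return [[math.dist(p, q) for q in points] for p in points]


def local_outlier_factors(dist, k):
    """Standard LOF scores from a symmetric distance matrix.

    Assumption: exactly k neighbors per point, ties broken by index.
    """
    n = len(dist)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    neighbors, k_distance = [], []
    for i in range(n):
        # Sort the other points by distance to i and keep the k closest.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: (dist[i][j], j))
        nearest = order[:k]
        neighbors.append(nearest)
        k_distance.append(dist[i][nearest[-1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(i, o) = max(k_distance(o), d(i, o)).
    lrd = []
    for i in range(n):
        total = sum(max(k_distance[o], dist[i][o]) for o in neighbors[i])
        if total == 0:
            raise ValueError("duplicate points need special handling")
        lrd.append(k / total)

    # LOF: mean ratio of the neighbors' densities to the point's own density.
    return [sum(lrd[o] for o in neighbors[i]) / (k * lrd[i]) for i in range(n)]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
    scores = local_outlier_factors(distance_matrix(pts), k=2)
    for p, s in zip(pts, scores):
        print(p, round(s, 3))
```

Scores close to 1 indicate a point whose local density matches that of its neighbors; scores well above 1 indicate a likely outlier. In a multi-source setting, the distance matrix would span records held by different parties, which is why the abstract singles out its computation as the step to protect.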
About the journal:
Information systems are the software and hardware systems that support data-intensive applications. The journal Information Systems publishes articles concerning the design and implementation of languages, data models, process models, algorithms, software and hardware for information systems.
Subject areas include data management issues as presented in the principal international database conferences (e.g., ACM SIGMOD/PODS, VLDB, ICDE and ICDT/EDBT) as well as data-related issues from the fields of data mining/machine learning, information retrieval coordinated with structured data, internet and cloud data management, business process management, web semantics, visual and audio information systems, scientific computing, and data science. Implementation papers having to do with massively parallel data management, fault tolerance in practice, and special purpose hardware for data-intensive systems are also welcome. Manuscripts from application domains, such as urban informatics, social and natural science, and Internet of Things, are also welcome. All papers should highlight innovative solutions to data management problems, such as new data models or performance enhancements, and show how those innovations contribute to the goals of the application.