Title: Preserving privacy in association rule mining using multi-threshold particle swarm optimization
Authors: Shahad Aljehani, Youseef Alotaibi
Journal: Information Sciences, volume 692, Article 121673 (impact factor 8.1; JCR category: Computer Science, Information Systems)
Published: 2024-11-19
DOI: 10.1016/j.ins.2024.121673
URL: https://www.sciencedirect.com/science/article/pii/S0020025524015871
Open access: no
Citation count: 0
Abstract
Healthcare data has become a powerful resource for generating insights that drive medical research. Association Rule Mining (ARM) techniques are widely used to identify relationships among diseases, treatments, and symptoms. However, sensitive information is often exposed, creating significant privacy challenges, particularly when data is integrated from multiple sources. Although Privacy-Preserving Association Rule Mining (PPARM) methods have been developed to address these issues, most rely on a single, predefined Minimum Support Threshold (MST) that is inflexible in adapting to diverse rule patterns. In this study, a Multi-Threshold Particle Swarm Optimization for Association Rule Mining (MPSO4ARM) model is introduced, integrating the Apriori and Particle Swarm Optimization (PSO) algorithms to perform data mining while protecting sensitive rules. A novel approach is employed by the proposed model to dynamically adjust the MST, allowing for more adaptive and effective privacy preservation. The MPSO4ARM model adjusts the MST on-the-fly based on rule length, improving its ability to safeguard sensitive data across various datasets. The proposed model was evaluated on the Chess, Mushroom, Retail, and Heart Disease datasets. The experimental results showed that the MPSO4ARM model outperforms traditional Apriori and conventional PSO algorithms, achieving higher fitness values and reducing side effects such as Hiding Failure (HF) and Missing Cost (MC), particularly in the Heart Disease and Mushroom datasets. Although the dynamic MST function introduces a moderate increase in computational runtime compared to Apriori and conventional PSO, this trade-off between execution time and enhanced privacy protection is considered acceptable, given the model's substantial improvements in data utility and rule sanitization.
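The abstract does not give the form of the dynamic MST function or the fitness function, so the following is only a minimal sketch of the two ideas it names: a minimum support threshold that depends on itemset length, and the Hiding Failure (HF) and Missing Cost (MC) side effects as they are commonly defined in the sanitization literature. The function `length_threshold`, its parameters `base`, `decay`, and `floor`, and the choice of a threshold that shrinks with length are all illustrative assumptions, not the authors' method. The sketch works on itemsets rather than full rules and enumerates candidates by brute force instead of using Apriori pruning, because a length-dependent threshold breaks the usual downward-closure shortcut.

```python
from itertools import combinations


def length_threshold(k, base=0.5, decay=0.7, floor=0.1):
    """Hypothetical length-dependent minimum support (an assumption, not the
    paper's function): the threshold shrinks geometrically as length k grows."""
    return max(floor, base * decay ** (k - 1))


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, max_len=3):
    """Brute-force enumeration of itemsets meeting their per-length threshold."""
    items = sorted(set().union(*transactions))
    found = set()
    for k in range(1, max_len + 1):
        mst = length_threshold(k)
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            if support(candidate, transactions) >= mst:
                found.add(candidate)
    return found


def side_effects(original, sanitized, sensitive, max_len=3):
    """Return (HF, MC) for a sanitized database.

    HF: share of sensitive frequent itemsets that are still frequent.
    MC: share of non-sensitive frequent itemsets that were lost.
    """
    before = frequent_itemsets(original, max_len)
    after = frequent_itemsets(sanitized, max_len)
    sens = before & sensitive
    non_sens = before - sensitive
    hf = len(sens & after) / len(sens) if sens else 0.0
    mc = len(non_sens - after) / len(non_sens) if non_sens else 0.0
    return hf, mc
```

Calling `side_effects(original, sanitized, sensitive)` with lists of Python sets for the two databases and a set of `frozenset` objects for the sensitive itemsets returns the pair of ratios. A sanitization search such as PSO would try to push both toward zero, and the weighting between them is a design choice this sketch leaves open.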
About the journal:
Information Sciences (Informatics and Computer Science, Intelligent Systems, Applications) is an international journal that publishes original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and survey contributions.
Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.