DP-PartFIM: Frequent Itemset Mining Using Differential Privacy and Partition
Authors: Xinyu Liu; Wensheng Gan; Lele Yu; Yining Liu
Journal: IEEE Transactions on Emerging Topics in Computing, vol. 13, no. 3, pp. 567-577
Publication date: 2024-08-26
DOI: 10.1109/TETC.2024.3443060
URL: https://ieeexplore.ieee.org/document/10645744/
Itemset mining is a popular data mining technique for extracting interesting and valuable information from large datasets. However, because datasets often contain sensitive private data, directly mining the data or sharing the mining results is not permitted. Previous privacy-preserving frequent itemset mining methods were inefficient because of their use of privacy budgets or long-transaction truncation strategies, which are impractical for large datasets. In this article, we propose DP-PartFIM, a more efficient partition mining technique based on differential privacy, which protects privacy while mining data. DP-PartFIM uses partition mining to find frequent itemsets and constructs a vertical data storage format for each partition, which keeps the algorithm efficient on large datasets. To protect data privacy, DP-PartFIM adds Laplace noise to the support of candidate itemsets. Experimental results show that, compared with classical privacy-preserving itemset mining methods, DP-PartFIM better guarantees data utility and privacy.
About the journal:
IEEE Transactions on Emerging Topics in Computing publishes papers on emerging aspects of computer science, computing technology, and computing applications not currently covered by other IEEE Computer Society Transactions. Some examples of emerging topics in computing include: IT for Green, Synthetic and organic computing structures and systems, Advanced analytics, Social/occupational computing, Location-based/client computer systems, Morphic computer design, Electronic game systems, & Health-care IT.