Mining High Utility Itemsets with Hill Climbing and Simulated Annealing

M. Nawaz, Philippe Fournier-Viger, Unil Yun, Youxi Wu, Wei Song
ACM Transactions on Management Information System (TMIS), published 2021-10-05. DOI: 10.1145/3462636 (https://doi.org/10.1145/3462636)
Citations: 20

Abstract

High utility itemset mining (HUIM) is the task of finding all sets of items that, when purchased together, generate a high profit in a transaction database. Several algorithms have been developed to mine high utility itemsets (HUIs). However, most of them cannot properly handle the exponential search space as the size of the database and the total number of items increase. Recently, evolutionary and heuristic algorithms were designed to mine HUIs, providing considerable performance improvements. However, they can still have long runtimes, and some may miss many HUIs. To address this problem, this article proposes two algorithms for HUIM based on Hill Climbing (HUIM-HC) and Simulated Annealing (HUIM-SA). Both algorithms transform the input database into a bitmap for efficient utility computation and search space pruning. To improve population diversity, HUIs discovered by evolution are used as target values for the next population instead of keeping the current optimal values in the next population. Experiments on real-life datasets show that the proposed algorithms are faster than state-of-the-art heuristic and evolutionary HUIM algorithms, that HUIM-SA discovers similar HUIs, and that HUIM-SA evolves linearly with the number of iterations.
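To make the two ideas named in the abstract concrete (a bitmap over transactions for utility computation, and a simulated-annealing search over itemsets), the following is a minimal Python sketch. It is not the authors' HUIM-SA: the toy database, the function names (`build_bitmap`, `utility`, `anneal`), the one-item-flip neighborhood, and the cooling parameters are all assumptions chosen for illustration, and the population and target-value mechanism described in the abstract is not modeled.

```python
import math
import random

# Hypothetical toy database: each transaction maps an item to its utility
# in that transaction (quantity times unit profit, already multiplied).
TRANSACTIONS = [
    {"a": 5, "b": 10, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "d": 4},
]


def build_bitmap(transactions):
    """Map each item to an int whose bit t is set if transaction t contains it."""
    bitmap = {}
    for tid, tx in enumerate(transactions):
        for item in tx:
            bitmap[item] = bitmap.get(item, 0) | (1 << tid)
    return bitmap


def utility(itemset, transactions, bitmap):
    """Sum the utility of `itemset` over every transaction that contains all of it."""
    if not itemset:
        return 0
    cover = (1 << len(transactions)) - 1
    for item in itemset:
        cover &= bitmap[item]  # bitwise AND keeps only the shared transactions
    total, tid = 0, 0
    while cover:
        if cover & 1:
            total += sum(transactions[tid][item] for item in itemset)
        cover >>= 1
        tid += 1
    return total


def anneal(transactions, min_util, iterations=2000, t0=10.0, cooling=0.995, seed=0):
    """Generic simulated annealing over itemsets (assumed parameters, not the paper's)."""
    rng = random.Random(seed)
    bitmap = build_bitmap(transactions)
    items = sorted(bitmap)
    current = frozenset([rng.choice(items)])
    current_u = utility(current, transactions, bitmap)
    found = {}
    if current_u >= min_util:
        found[current] = current_u
    temp = t0
    for _ in range(iterations):
        # Neighbor: add or remove one randomly chosen item.
        neighbor = current ^ {rng.choice(items)}
        if not neighbor:
            continue
        new_u = utility(neighbor, transactions, bitmap)
        if new_u >= min_util:
            found[neighbor] = new_u  # record every high utility itemset visited
        delta = new_u - current_u
        # Always accept improvements; accept worse moves with probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_u = neighbor, new_u
        temp = max(temp * cooling, 1e-9)
    return found


if __name__ == "__main__":
    for itemset, u in sorted(anneal(TRANSACTIONS, min_util=20).items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), u)
```

Because the search is heuristic, the returned dictionary may omit some itemsets whose utility meets `min_util`. This matches the abstract's remark that heuristic miners can miss HUIs. Setting the temperature near zero from the start makes the acceptance rule behave much like hill climbing, since worse moves are then almost never accepted.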