Philippe Fournier-Viger, Yimin Zhang, Jerry Chun‐wei Lin, Duy-Tai Dinh, H. Le
Log. J. IGPL, published 2020-01-24. DOI: https://doi.org/10.1093/jigpal/jzz068
Mining correlated high-utility itemsets using various measures
Discovering high-utility itemsets consists of finding sets of items that yield a high profit in customer transaction databases. An important limitation of traditional high-utility itemset mining is that utility is the only measure used to assess the interestingness of patterns. This leads to finding itemsets that have a high profit but contain weakly correlated items. To address this issue, this paper integrates the concept of correlation into high-utility itemset mining to find profitable itemsets that are highly correlated, using the all-confidence and bond measures. An algorithm named FCHM (Fast Correlated High-utility itemset Miner) is proposed to efficiently discover correlated high-utility itemsets. Two versions of the algorithm are proposed, named FCHM_all-confidence and FCHM_bond, based on the all-confidence and bond measures, respectively. An experimental evaluation was carried out using four real-life benchmark datasets from the high-utility itemset mining literature: mushroom, retail, kosarak and foodmart. Results show that FCHM is efficient and can prune a large number of weakly correlated high-utility itemsets.
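To make the three quantities named in the abstract concrete, the following is a minimal Python sketch that computes utility, all-confidence and bond on a tiny database and keeps the itemsets that pass both a utility and a correlation threshold. It is not the FCHM algorithm: it enumerates every itemset by brute force and applies no pruning. The database, the unit profits and all function names are hypothetical, and the definitions used are the usual ones, which the abstract itself does not spell out: all-confidence is the support of the itemset divided by the largest support of any single item in it, and bond is the support divided by the number of transactions containing at least one of its items.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 2},
    {"a": 1, "b": 1, "c": 2, "d": 1},
]
# Hypothetical unit profit of each item.
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1, "d": 3}


def support(itemset, transactions):
    """Number of transactions containing every item of the itemset."""
    return sum(1 for t in transactions if all(i in t for i in itemset))


def disjunctive_support(itemset, transactions):
    """Number of transactions containing at least one item of the itemset."""
    return sum(1 for t in transactions if any(i in t for i in itemset))


def utility(itemset, transactions, profit):
    """Total profit of the itemset over the transactions that contain it."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def all_confidence(itemset, transactions):
    """Support of the itemset divided by the largest single-item support."""
    largest = max(support((i,), transactions) for i in itemset)
    return support(itemset, transactions) / largest if largest else 0.0


def bond(itemset, transactions):
    """Support of the itemset divided by its disjunctive support."""
    dis = disjunctive_support(itemset, transactions)
    return support(itemset, transactions) / dis if dis else 0.0


def correlated_high_utility_itemsets(transactions, profit, min_util,
                                     min_corr, measure):
    """Brute-force filter (no pruning): keep itemsets meeting both thresholds."""
    items = sorted({i for t in transactions for i in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, profit)
            if u >= min_util and measure(itemset, transactions) >= min_corr:
                result[itemset] = u
    return result


if __name__ == "__main__":
    for name, measure in (("all-confidence", all_confidence), ("bond", bond)):
        found = correlated_high_utility_itemsets(
            TRANSACTIONS, UNIT_PROFIT, min_util=10, min_corr=0.6,
            measure=measure)
        print(name, found)
```

On this toy data the itemset {a, b} has a utility of 16 but a bond of only 0.5, so a bond threshold of 0.6 discards it even though it is profitable. This is the kind of weakly correlated high-utility itemset the abstract refers to.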