CG-FHAUI: an efficient algorithm for simultaneously mining succinct pattern sets of frequent high average utility itemsets

IF 3.1 · Region 4, Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge and Information Systems Pub Date : 2024-05-07 DOI:10.1007/s10115-024-02121-7

Hai Duong, Tin Truong, Bac Le, Philippe Fournier-Viger


Citations: 0

Abstract

Identifying both closed frequent high average utility itemsets (CFHAUIs) and generators of frequent high average utility itemsets (GFHAUIs) is important because together they form a concise representation of the frequent high average utility itemsets (FHAUIs). These representations can be much smaller than the full set of FHAUIs while preserving its essential information. In addition, they allow the generation of non-redundant high average utility association rules, which are useful to decision-makers. However, discovering these representations is difficult, primarily because the average utility function is neither monotonic nor anti-monotonic within each equivalence class, that is, among itemsets appearing in the same set of transactions. To tackle this challenge, this paper proposes a method for efficiently extracting CFHAUIs and GFHAUIs. The approach introduces novel bounds on the average utility, including a weak lower bound called \(wlbau\) and a lower bound named \(auvlb\). Pruning strategies based on the \(wlbau\) and \(auvlb\) bounds are designed to eliminate non-closed and/or non-generator FHAUIs early, reducing runtime and memory consumption. The paper also introduces a novel algorithm, CG-FHAUI, that discovers GFHAUIs and CFHAUIs concurrently. Empirical results show that the proposed algorithm outperforms a baseline algorithm in terms of runtime, memory usage, and scalability.
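To make the terms in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the CG-FHAUI algorithm and does not use the \(wlbau\) or \(auvlb\) bounds, whose definitions are not given here. The toy database `DB`, the thresholds `minsup` and `minau`, and all function names are hypothetical. The sketch also assumes that average utility is the total utility of an itemset divided by its length, and that an FHAUI counts as closed (or as a generator) when no proper superset (or subset) among the FHAUIs occurs in the same set of transactions; the paper's exact definitions may differ.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> utility in that transaction.
DB = [
    {"a": 4, "b": 2, "c": 6, "d": 1},
    {"a": 2, "b": 2},
    {"a": 6, "c": 3, "d": 1},
    {"b": 1},
]


def tidset(itemset, db=DB):
    """Indices of the transactions that contain every item of `itemset`."""
    return frozenset(i for i, t in enumerate(db) if all(x in t for x in itemset))


def avg_utility(itemset, db=DB):
    """Total utility of `itemset` over its supporting transactions, divided by its length."""
    total = sum(db[i][x] for i in tidset(itemset, db) for x in itemset)
    return total / len(itemset)


def mine_brute_force(db=DB, minsup=2, minau=5.0):
    """Enumerate every itemset; return (FHAUIs, closed FHAUIs, generator FHAUIs)."""
    items = sorted({x for t in db for x in t})
    fhauis = {}  # itemset -> tidset, for itemsets meeting both thresholds
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            tids = tidset(x, db)
            if len(tids) >= minsup and avg_utility(x, db) >= minau:
                fhauis[x] = tids
    # Assumption: closedness and generator status are checked only against other FHAUIs.
    closed = {x for x, t in fhauis.items()
              if not any(x < y and t == fhauis[y] for y in fhauis)}
    gens = {x for x, t in fhauis.items()
            if not any(y < x and t == fhauis[y] for y in fhauis)}
    return fhauis, closed, gens


if __name__ == "__main__":
    fhauis, closed, gens = mine_brute_force()
    show = lambda family: sorted("".join(sorted(x)) for x in family)
    print("FHAUIs:    ", show(fhauis))
    print("closed:    ", show(closed))
    print("generators:", show(gens))
```

In this toy database the itemsets {c}, {a, c}, and {a, c, d} all occur in exactly the same two transactions, yet their average utilities are 9.0, 9.5, and 7.0. The value first rises and then falls along a chain of supersets inside one equivalence class, which is the non-monotonic behavior the abstract identifies as the source of difficulty.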


Source journal

Knowledge and Information Systems (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 5.70

Self-citation rate: 7.40%

Articles published: 152

Review time: 7.2 months

Journal description: Knowledge and Information Systems (KAIS) provides an international forum for researchers and professionals to share their knowledge and report new advances on all topics related to knowledge systems and advanced information systems. This monthly peer-reviewed archival journal publishes state-of-the-art research reports on emerging topics in KAIS, reviews of important techniques in related areas, and application papers of interest to a general readership.