Utility-Driven Data Analytics Algorithm for Transaction Modifications Using Pre-Large Concept With Single Database Scan

IF 5.7; Zone 3, Computer Science; JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Big Data Pub Date : 2025-04-01 DOI:10.1109/TBDATA.2025.3556615

Unil Yun;Hanju Kim;Myungha Cho;Taewoong Ryu;Seungwan Park;Doyoon Kim;Doyoung Kim;Chanhee Lee;Witold Pedrycz

IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2792-2808. https://ieeexplore.ieee.org/document/10946869/

Citation count: 0

Abstract

Utility-driven pattern analysis is a fundamental method for analyzing noteworthy patterns with high utility for diverse quantitative transactional databases. Recently, various approaches have emerged to handle large, dynamic database environments more efficiently by reducing the number of data scans and pattern expansion operations with the pre-large concept. However, existing pre-large-based high utility pattern mining methods either fail to handle real-time transaction modifications or require additional data scans to validate candidate patterns. In this paper, we propose a novel efficient utility-driven pattern mining algorithm using the pre-large concept for transaction modifications. Our method incorporates a single-scan-based framework through the management of actual utility values and discovers high utility patterns without candidate generation for efficient utility-driven dynamic data analysis in the modification environment. We compared the performance of the proposed method with state-of-the-art methods through extensive performance evaluation utilizing real and synthetic datasets. According to the evaluation results and a case study, the suggested method performs a minimum of 1.5 times faster than state-of-the-art methods alongside minimal compromise in memory, and it scaled well with increases in database size. Further statistical analyses indicate that the proposed method reduces the pattern search space compared to the previous method while delivering a complete set of accurate results without loss.
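The two ideas the abstract relies on, the utility of a pattern and the pre-large buffer between "high utility" and "discarded", can be illustrated with a small sketch. This is not the authors' algorithm: it is a brute-force toy that assumes a hypothetical database of item quantities, hypothetical unit profits, and two hypothetical thresholds (`upper`, `lower`) expressed as fractions of the total database utility. It only shows how a pattern's actual utility is computed and how the two thresholds split patterns into large, pre-large, and small.

```python
# Hypothetical toy data: each transaction maps an item to its purchased quantity.
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"b": 2, "c": 1},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical unit profits (the external utility of each item).
unit_profit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, db, profit):
    """Sum quantity * unit profit over every transaction containing all items."""
    total = 0
    for tx in db:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * profit[item] for item in itemset)
    return total


def total_utility(db, profit):
    """Utility of the whole database: every item occurrence counted once."""
    return sum(qty * profit[item] for tx in db for item, qty in tx.items())


def classify(utility, total, upper, lower):
    """Label a pattern by its utility ratio against two assumed thresholds."""
    ratio = utility / total
    if ratio >= upper:
        return "large"      # reported as a high utility pattern
    if ratio >= lower:
        return "pre-large"  # kept as a buffer in case later changes promote it
    return "small"


if __name__ == "__main__":
    total = total_utility(transactions, unit_profit)
    for pattern in [("a",), ("b",), ("c",), ("a", "b")]:
        u = itemset_utility(pattern, transactions, unit_profit)
        print(pattern, u, classify(u, total, upper=0.4, lower=0.2))
```

Keeping the pre-large patterns around is what lets an incremental method absorb a limited amount of inserted, deleted, or modified transactions without rescanning the original data. How the proposed method stores actual utilities and avoids candidate generation is described in the paper itself and is not reproduced here.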

Source journal

IEEE Transactions on Big Data Multiple-

CiteScore

11.80

Self-citation rate

2.80%

Articles published

114

Journal description: The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.