Doyoung Kim, Heonho Kim, Seungwan Park, Hanju Kim, Myungha Cho, Seongbin Park, Taewoong Ryu, Chanhee Lee, Hyeonmo Kim, Unil Yun
Title: Efficient mining of incremental high utility patterns with negative unit profits over all the accumulated stream data
Journal: Knowledge-Based Systems, volume 325, Article 113956
Publication date: 2025-06-13
Publication type: Journal Article
DOI: 10.1016/j.knosys.2025.113956
URL: https://www.sciencedirect.com/science/article/pii/S0950705125010019
Impact factor: 7.6 (JCR Q1, Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Traditional high utility pattern mining has assumed that items in databases have positive unit profits, but real-world applications often require negative unit profits to be considered as well. Consequently, many algorithms that handle both positive and negative unit profits have been proposed for static data environments. Meanwhile, one of the most important aspects of data analysis in real-world systems is how to handle accumulated stream data. However, existing methods that consider negative unit profits in a static environment are inadequate for processing data streams, because they require repeated data access and therefore consume additional resources through multiple data scans. This paper proposes an effective method for high utility stream pattern mining that considers both positive and negative unit profits in dynamic databases. To avoid storing data in memory and scanning it multiple times, the proposed approach constructs its data structure with a single scan of the incremental data, without retaining that data in memory. Then, through a reconstruction process, it efficiently integrates and manages the new data while optimally maintaining the structures. This methodology enables efficient mining without the loss of significant patterns. Experiments with real and synthetic datasets show that the proposed approach outperforms state-of-the-art methods, including adjusted approaches, in runtime, memory usage, and scalability. In addition, the proposed method performs better than the baseline method with respect to the resources consumed by each process and the number of incremental databases. Further statistical evaluation of the accuracy test shows that the proposed method extracts results without pattern loss or duplication.
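To make the problem setting concrete, the sketch below computes itemset utility using the standard definition from the high utility pattern mining literature: the utility of an itemset is the sum of quantity times unit profit over its items, taken over every transaction that contains the whole itemset. It is not the paper's algorithm or data structure; the function name `itemset_utility`, the toy items, their profits, and the threshold `min_util` are all assumptions made for illustration. Because utility is additive over transactions, the running totals can be updated by reading each incoming batch once.

```python
from typing import Dict, FrozenSet, Iterable, List

Transaction = Dict[str, int]  # item -> purchased quantity


def itemset_utility(
    batch: Iterable[Transaction],
    unit_profit: Dict[str, int],
    itemset: FrozenSet[str],
) -> int:
    """Sum quantity * unit profit over transactions containing every item."""
    total = 0
    for tx in batch:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


# Hypothetical unit profits; item "c" is sold at a loss (negative profit).
unit_profit = {"a": 5, "b": 2, "c": -3}

# Two hypothetical increments of a transaction stream.
batch1: List[Transaction] = [{"a": 1, "b": 2}, {"a": 2, "c": 1}]
batch2: List[Transaction] = [{"a": 1, "b": 1, "c": 2}]

candidates = [
    frozenset({"a"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
]

# Running totals over the accumulated stream; each batch is read once
# and is not kept afterwards.
running = {c: 0 for c in candidates}
for batch in (batch1, batch2):
    for c in candidates:
        running[c] += itemset_utility(batch, unit_profit, c)

min_util = 10  # assumed threshold for this toy example
high_utility = {c for c, u in running.items() if u >= min_util}
```

In this toy data, adding the negative-profit item "c" to {"a"} lowers the accumulated utility from 20 to 6, which illustrates why pruning bounds designed for positive-only profits need adjustment when negative unit profits are present.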
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.