A constraint programming approach for quantitative frequent pattern mining
Authors: Mohammed El Amine Laghzaoui, Yahia Lebbah
Journal: International Journal of Data Mining Modelling and Management (JCR Q4, Computer Science, Artificial Intelligence; impact factor 0.4)
Published: 2023-01-01
DOI: 10.1504/ijdmmm.2023.132973 (https://doi.org/10.1504/ijdmmm.2023.132973)
Citations: 0
Abstract
Itemset mining was the first pattern mining problem studied in the literature. Most itemset mining studies have considered only Boolean datasets, where each transaction either contains an item or does not. In practical applications, however, items appear in transactions with quantities. In this paper, we propose an extension of the current efficient constraint programming approach for itemset mining that takes quantitative items into account, in order to find patterns together with their quantities directly on the original quantitative dataset. The contribution is twofold. Firstly, we facilitate the modelling of mining problems through a new constraint. Secondly, we propose a new filtering algorithm to handle the frequency and closeness constraints. Experiments performed on standard benchmark datasets with numerous mining constraints show that our approach finds more informative quantitative patterns, with better running times than quantitative approaches based on classical Boolean patterns.
About the journal:
Facilitating the transformation from data to information to knowledge is paramount for organisations. Companies are flooded with data and conflicting information, but have limited real usable knowledge. However, a process should rarely be looked at from limited angles or in parts. Isolated islands of data mining, modelling and management (DMMM) should be connected. IJDMMM highlights the integration of DMMM; of statistics, machine learning and databases; of each element of data chain management; of types of information and algorithms in software; from data pre-processing to post-processing; and between theory and applications. Topics covered include:
- Artificial intelligence
- Biomedical science
- Business analytics/intelligence, process modelling
- Computer science, database management systems
- Data management, mining, modelling, warehousing
- Engineering
- Environmental science, environment (ecoinformatics)
- Information systems/technology, telecommunications/networking
- Management science, operations research, mathematics/statistics
- Social sciences
- Business/economics, (computational) finance
- Healthcare, medicine, pharmaceuticals
- (Computational) chemistry, biology (bioinformatics)
- Sustainable mobility systems, intelligent transportation systems
- National security