Authors: Ambily Balaram, Nedunchezhian Raju
Journal: The International Arab Journal of Information Technology
Publication date: 2023-01-01
DOI: https://doi.org/10.34028/iajit/20/6/6
An Effective Reference-Point-Set (RPS) Based Bi-Directional Frequent Itemset Generation
Data Mining (DM) combines several fields to extract hidden patterns from vast amounts of historical data. Association Rule Mining (ARM) is the DM activity used to produce association rules. To significantly reduce time and space complexity, the proposed method uses a bi-directional frequent itemset generation approach in which the dataset is explicitly split into dense and sparse regions while frequent itemsets are mined. The paper also proposes a second feature: a predetermined candidate subset, called the Reference-Points-Set (RPS), that reduces the complexity of mining frequent itemsets by reducing the number of scans over the actual dataset. The novelty lies in examining possible candidates during the initial database scans, which cuts down the number of additional scans required. According to the experimental data, the average scan count of the proposed method is 24% lower than that of Dynamic Itemset Counting (DIC) and 65% lower than that of M-Apriori across different support counts. The proposed method typically reduces execution time by 10% relative to DIC and is three times more efficient than M-Apriori. These results significantly outperform those of the earlier methods, which strongly supports the proposed approach for generating frequent itemsets from large datasets.
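To make the scan-count metric concrete, the following is a minimal sketch of a plain level-wise Apriori miner that counts its passes over the transaction list. It is not the RPS-based method, DIC, or M-Apriori, whose details are not given in the abstract; the function name `apriori_with_scan_count` and the toy transactions are assumptions made for illustration only.

```python
from itertools import combinations


def apriori_with_scan_count(transactions, min_support):
    """Baseline level-wise Apriori (illustrative, not the paper's method).

    Returns a dict mapping each frequent itemset (frozenset) to its support
    count, together with the number of full passes over the transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    scans = 0

    # Pass 1: count every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    scans += 1

    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2

    while current:
        # Join frequent (k-1)-itemsets into k-itemset candidates and prune
        # any candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a in current:
            for b in current:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in current
                    for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        if not candidates:
            break

        # One additional full pass per level to count the candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        scans += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent, scans


if __name__ == "__main__":
    # Hypothetical toy dataset, not taken from the paper.
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    itemsets, scans = apriori_with_scan_count(data, min_support=3)
    print(len(itemsets), "frequent itemsets found in", scans, "scans")
```

In this baseline, every itemset length costs one full pass over the data, which is the cost that approaches such as DIC, and by the abstract's account the RPS, aim to lower by considering candidates earlier.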