An Improved Apriori Algorithm Based on the Matrix
Feng Wang, Yong-hua Li
2008 International Seminar on Future BioMedical Information Engineering, 2008-12-18
DOI: 10.1109/FBIE.2008.80 (https://doi.org/10.1109/FBIE.2008.80)

The Apriori algorithm is a classical algorithm for association rule mining and one of the most important, but it has limitations. It generates an excessive number of candidate itemsets and must scan the database repeatedly to find the frequent itemsets, which makes it inefficient. To address this bottleneck, this paper introduces an improved algorithm based on a matrix. It represents the transactions in the database as a matrix and applies the AND operation to that matrix to produce the largest frequent itemsets and the others. The algorithm does not need to rescan the database to look up transactions, and it greatly reduces the number of candidate frequent itemsets. An example is used to analyze and compare the two algorithms; the results show that the improved algorithm reduces computation time and improves computational efficiency.
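
The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea, not the authors' exact method. It assumes the matrix is a Boolean item-by-transaction encoding built in a single pass over the database, and that support is counted by AND-ing item vectors instead of rescanning the transactions. The function names (`build_matrix`, `support`, `frequent_itemsets`) and the sample data are hypothetical.

```python
from itertools import combinations


def build_matrix(transactions, items):
    """Scan the transactions once and store one Boolean vector per item.

    matrix[item][j] is True when transaction j contains the item.
    """
    return {item: [item in t for t in transactions] for item in items}


def support(matrix, itemset):
    """Count the transactions that contain every item in the itemset.

    The item vectors are AND-ed position by position; the database
    itself is never consulted again.
    """
    vectors = [matrix[item] for item in itemset]
    return sum(all(bits) for bits in zip(*vectors))


def frequent_itemsets(transactions, min_support):
    """Return {itemset tuple: support count} for all frequent itemsets.

    Level-wise search in the Apriori style: a k-itemset becomes a
    candidate only if all of its (k-1)-subsets are already frequent.
    """
    items = sorted({item for t in transactions for item in t})
    matrix = build_matrix(transactions, items)

    # Frequent 1-itemsets, read directly from the matrix.
    current = {}
    for item in items:
        count = support(matrix, (item,))
        if count >= min_support:
            current[(item,)] = count

    result = {}
    k = 1
    while current:
        result.update(current)
        k += 1
        # Only items that survive at level k-1 can appear at level k.
        alive = sorted({item for itemset in current for item in itemset})
        next_level = {}
        for candidate in combinations(alive, k):
            # Prune candidates that have an infrequent (k-1)-subset.
            if all(sub in current for sub in combinations(candidate, k - 1)):
                count = support(matrix, candidate)
                if count >= min_support:
                    next_level[candidate] = count
        current = next_level
    return result


# Hypothetical sample database of five transactions.
sample = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
for itemset, count in sorted(frequent_itemsets(sample, min_support=3).items()):
    print(itemset, count)
```

In this sketch the transactions are read only while the matrix is built; every later support count is computed from the stored vectors. A production version would likely pack each vector into an integer or a bit array so that the AND is a single machine operation.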