An Efficient Algorithm for Mining Large Item Sets
Hong-Zhen Zheng, Dian-Hui Chu, D. Zhan, Xiaofei Xu
2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery, 2008-10-18
DOI: https://doi.org/10.1109/FSKD.2008.679
Citations: 1
Abstract
We propose an online mining algorithm (OMA) that discovers large item sets online. Rather than pre-setting a default threshold, OMA achieves its efficiency and threshold flexibility by calculating the item sets' counts. It therefore neither needs nor depends on a default threshold, and it can flexibly adapt to any threshold the user supplies. In addition, we propose a cluster-based association rule algorithm (CARA), which creates cluster tables to aid the discovery of large item sets. CARA requires only a single scan of the database, followed by comparisons against the partial cluster tables. It prunes a considerable amount of data, which reduces the time needed for data scans and the number of comparisons required, while still ensuring the correctness of the mined results. Because CARA creates the cluster tables in advance, each CPU can be assigned one cluster table to process; thus large item sets can be mined immediately even when the database is very large.
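The threshold-independent counting idea can be illustrated with a minimal sketch. It is not the authors' OMA or CARA implementation: the function names, the `max_size` cap on item-set size, and the sample transactions are all assumptions made for illustration, and the cluster tables are not modeled because the abstract does not define them. The sketch only shows that once item-set counts are stored from a single scan, any user-supplied threshold can be applied afterwards without rescanning the data.

```python
from collections import Counter
from itertools import combinations


def count_item_sets(transactions, max_size=2):
    """Scan the transactions once and count every item set of up to max_size items.

    max_size is a hypothetical cap that keeps the example small.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, duplicates dropped
        for size in range(1, max_size + 1):
            for item_set in combinations(items, size):
                counts[item_set] += 1
    return counts


def large_item_sets(counts, n_transactions, min_support):
    """Filter the stored counts by a user-supplied support threshold (no rescan)."""
    return {s: c for s, c in counts.items() if c / n_transactions >= min_support}


# Hypothetical sample data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

counts = count_item_sets(transactions)

# The same counts answer queries for different thresholds.
loose = large_item_sets(counts, len(transactions), 0.5)
strict = large_item_sets(counts, len(transactions), 0.75)
```

In this sketch the expensive step (the scan) runs once, and changing the threshold only re-filters the stored counts, which is the kind of flexibility the abstract attributes to OMA.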