{"title":"Integrating Frequent Pattern Mining from Multiple Data Domains for Classification","authors":"D. Patel, W. Hsu, M. Lee","doi":"10.1109/ICDE.2012.63","DOIUrl":null,"url":null,"abstract":"Many frequent pattern mining algorithms have been developed for categorical, numerical, time series, or interval data. However, little attention has been given to integrate these algorithms so as to mine frequent patterns from multiple domain datasets for classification. In this paper, we introduce the notion of a heterogenous pattern to capture the associations among different kinds of data. We propose a unified framework for mining multiple domain datasets and design an iterative algorithm called HTMiner. HTMiner discovers essential heterogenous patterns for classification and performs instance elimination. This instance elimination step reduces the problem size progressively by removing training instances which are correctly covered by the discovered essential heterogenous pattern. Experiments on two real world datasets show that the HTMiner is efficient and can significantly improve the classification accuracy.","PeriodicalId":321608,"journal":{"name":"2012 IEEE 28th International Conference on Data Engineering","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE 28th International Conference on Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE.2012.63","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 8
Abstract
Many frequent pattern mining algorithms have been developed for categorical, numerical, time series, or interval data. However, little attention has been given to integrating these algorithms so as to mine frequent patterns from multi-domain datasets for classification. In this paper, we introduce the notion of a heterogeneous pattern to capture the associations among different kinds of data. We propose a unified framework for mining multi-domain datasets and design an iterative algorithm called HTMiner. HTMiner discovers essential heterogeneous patterns for classification and performs instance elimination. This instance elimination step progressively reduces the problem size by removing training instances that are correctly covered by the discovered essential heterogeneous patterns. Experiments on two real-world datasets show that HTMiner is efficient and can significantly improve classification accuracy.
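The iterate-and-eliminate idea described in the abstract can be illustrated with a minimal sketch. This is not HTMiner itself: the abstract does not give the algorithm's details, so everything below is an assumption made for illustration. In particular, the sketch assumes each instance has already been reduced to a set of items tagged by domain (`cat:`, `num:`, `ts:`), that candidate patterns are itemsets of size 1 or 2, and that the "best" pattern is the one with the highest confidence (ties broken by support). The names `best_pattern` and `mine_with_elimination` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def best_pattern(rows, labels, min_support):
    """Return (pattern, predicted_label) for the most confident itemset of
    size 1 or 2 that covers at least `min_support` rows, or None.
    Selection by confidence is an assumption, not the paper's criterion."""
    items = sorted(set().union(*rows))
    candidates = [frozenset(c) for k in (1, 2) for c in combinations(items, k)]
    best = None
    for pattern in candidates:
        # Labels of the rows that contain every item of the pattern.
        covered = [lab for row, lab in zip(rows, labels) if pattern <= row]
        if len(covered) < min_support:
            continue
        label, hits = Counter(covered).most_common(1)[0]
        score = (hits / len(covered), len(covered))  # (confidence, support)
        if best is None or score > best[0]:
            best = (score, pattern, label)
    return None if best is None else (best[1], best[2])


def mine_with_elimination(rows, labels, min_support=2):
    """Repeatedly pick a pattern, then drop the training instances that the
    pattern covers and labels correctly, so each round sees a smaller problem."""
    remaining = list(zip(rows, labels))
    rules = []
    while remaining:
        found = best_pattern([r for r, _ in remaining],
                             [l for _, l in remaining], min_support)
        if found is None:
            break
        pattern, label = found
        rules.append((pattern, label))
        # Instance elimination: keep only rows not correctly covered.
        remaining = [(r, l) for r, l in remaining
                     if not (pattern <= r and l == label)]
    return rules, remaining


# Toy instances mixing categorical, discretized numeric, and time-series items.
rows = [
    {"cat:sex=F", "num:temp=high", "ts:trend=up"},
    {"cat:sex=M", "num:temp=high", "ts:trend=up"},
    {"cat:sex=F", "num:temp=high", "ts:trend=up"},
    {"cat:sex=M", "num:temp=high", "ts:trend=down"},
    {"cat:sex=F", "num:temp=low", "ts:trend=up"},
    {"cat:sex=M", "num:temp=low", "ts:trend=down"},
]
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

rules, remaining = mine_with_elimination(rows, labels)
for pattern, label in rules:
    print(sorted(pattern), "->", label)
print("uncovered instances:", len(remaining))
```

In this toy run the first selected pattern combines a numeric item with a time-series item, which is the kind of cross-domain association a heterogeneous pattern is meant to capture. Each round then operates only on the instances that earlier patterns did not correctly cover.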