Title: A clustering rule-based approach to predictive modeling
Authors: Philicity Williams, C. Soares, J. Gilbert
Venue: ACM SE '10
Published: 2010-04-15
DOI: 10.1145/1900008.1900071 (https://doi.org/10.1145/1900008.1900071)
Citations: 3
Abstract
Rule-based classifiers and clustering applied before learning have each been shown to improve classification accuracy in predictive modeling tasks. This research introduces an approach that combines the two techniques and studies its effect on predictive accuracy. The algorithm presented here, the Clustering Rule-based Algorithm (CRA), first clusters the original training set using an Expectation Maximization (EM) algorithm. A separate Classification and Regression Tree (CART) is then trained on each cluster. To obtain an upper bound on accuracy, each test instance is evaluated against all of the rules produced by every tree, to determine whether at least one rule from any tree classifies it correctly. Under this upper-bound evaluation, a predictive accuracy of 100% was achievable. The approach thus draws on the advantages of both supervised and unsupervised learning to produce a more powerful and more accurate predictive model.
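
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, uses `GaussianMixture` (which is fitted by EM) for the clustering step and `DecisionTreeClassifier` (a CART-style tree) for the per-cluster models, and runs on the Iris data set purely as a placeholder. The function names, the choice of three clusters, and the fallback tree are all hypothetical. A tree's prediction is treated here as the outcome of the single rule (root-to-leaf path) that fires for an instance. The sketch also reports a "routed" accuracy, where each test instance is sent only to the tree of its own cluster, to show how the upper bound differs from a score that can be obtained without knowing the true label.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_cra(X_train, y_train, n_clusters=3, seed=0):
    """Cluster the training set with EM, then fit one CART-style tree per cluster."""
    # Assumption: a Gaussian mixture fitted by EM stands in for the paper's EM step.
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed).fit(X_train)
    labels = gmm.predict(X_train)
    trees = {}
    for k in np.unique(labels):  # only clusters that actually received training points
        mask = labels == k
        tree = DecisionTreeClassifier(random_state=seed)
        trees[int(k)] = tree.fit(X_train[mask], y_train[mask])
    return gmm, trees


def upper_bound_accuracy(trees, X_test, y_test):
    """Fraction of test instances that at least one per-cluster tree labels correctly."""
    # One column of predictions per tree: shape (n_test, n_trees).
    preds = np.column_stack([t.predict(X_test) for t in trees.values()])
    return float(np.mean((preds == y_test[:, None]).any(axis=1)))


def routed_accuracy(gmm, trees, X_test, y_test):
    """Accuracy when each test instance is sent only to the tree of its own cluster."""
    clusters = gmm.predict(X_test)
    # Hypothetical fallback for a cluster that had no training points (and so no tree).
    fallback = next(iter(trees.values()))
    preds = np.array([
        trees.get(int(c), fallback).predict(x.reshape(1, -1))[0]
        for c, x in zip(clusters, X_test)
    ])
    return float(np.mean(preds == y_test))


# Placeholder data: Iris is an assumption, not the data set used in the paper.
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
gmm, trees = fit_cra(X_tr, y_tr)
oracle = upper_bound_accuracy(trees, X_te, y_te)
routed = routed_accuracy(gmm, trees, X_te, y_te)
print(f"upper bound: {oracle:.3f}, routed: {routed:.3f}")
```

Because the routed tree is always one of the trees consulted by the upper-bound check, the upper bound can never fall below the routed accuracy. The upper bound uses the true label to pick the successful tree, so it describes what is attainable in principle rather than what a deployed model would score.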