A novel pruning approach using expert knowledge
A. M. Mahmood, M. Kuppa
INTERACT-2010, published 2010-12-01
DOI: 10.1109/INTERACT.2010.5706189 (https://doi.org/10.1109/INTERACT.2010.5706189)
Citation count: 1
Abstract
Many traditional pruning methods assume that all datasets are equally probable and equally important, and therefore apply the same pruning to every dataset. In real-world classification problems, however, datasets are not all alike. Consequently, using a uniform pruning rate tends to produce inefficient, oversized decision trees. We therefore propose a practical algorithm for the data-specific classification problem that arises when datasets have different properties. First, we computed a data-specific pruning value for each dataset. Then, we used expert knowledge to find an inexact pruning value. Finally, we integrated those values into a well-established pruning technique to form Expert Knowledge Based Pruning (EKBP). We empirically validated the analysis on 40 publicly available UCI datasets against four existing techniques. Both the analytical and the experimental results show that the proposed method reduces tree size while retaining equal or better accuracy.
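The abstract does not specify how the pruning values are computed or combined, so the following is only an illustrative sketch of the general idea, not the EKBP algorithm itself. It assumes a simple penalty-based rule in the spirit of cost-complexity pruning: a subtree is collapsed into a leaf when the leaf's training errors plus a per-leaf penalty `alpha` do not exceed the penalized cost of keeping the subtree. The names `alpha`, `blended_pruning_value`, and `weight`, and the weighted-average combination of a data-derived value with an expert-supplied value, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    counts: List[int]  # training class counts reaching this node
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def leaf_errors(node: Node) -> int:
    """Training errors if this node predicted its majority class."""
    return sum(node.counts) - max(node.counts)


def subtree_errors(node: Node) -> int:
    """Training errors summed over the leaves below this node."""
    if node.is_leaf():
        return leaf_errors(node)
    return sum(subtree_errors(child) for child in node.children)


def count_leaves(node: Node) -> int:
    if node.is_leaf():
        return 1
    return sum(count_leaves(child) for child in node.children)


def prune(node: Node, alpha: float) -> Node:
    """Bottom-up pruning with a per-leaf penalty `alpha` (hypothetical rule)."""
    if node.is_leaf():
        return node
    candidate = Node(node.counts, [prune(child, alpha) for child in node.children])
    keep_cost = subtree_errors(candidate) + alpha * count_leaves(candidate)
    leaf_cost = leaf_errors(node) + alpha
    # Collapse the subtree when a single leaf is no worse under the penalty.
    return Node(node.counts) if leaf_cost <= keep_cost else candidate


def blended_pruning_value(data_value: float, expert_value: float,
                          weight: float = 0.5) -> float:
    """Assumed combination: weighted average of data-derived and expert values."""
    return weight * data_value + (1.0 - weight) * expert_value


def example_tree() -> Node:
    """A small hand-built tree with made-up class counts."""
    left = Node([45, 5], [Node([44, 3]), Node([1, 2])])
    right = Node([5, 45])
    return Node([50, 50], [left, right])


if __name__ == "__main__":
    for alpha in (0.5, blended_pruning_value(0.5, 3.5), 100.0):
        pruned = prune(example_tree(), alpha)
        print(f"alpha={alpha}: {count_leaves(pruned)} leaves")
```

In this toy setting, a larger pruning value yields a smaller tree, which is why a single value applied to every dataset can leave some trees larger than necessary. Whether accuracy is preserved would have to be checked on held-out data, as the paper does empirically.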