Taxonomy of tree based classification algorithm
D. Gupta, Dilpreet Singh Kohli, Rajni Jindal
2011 2nd International Conference on Computer and Communication Technology (ICCCT-2011), 2011-11-10
DOI: 10.1109/ICCCT.2011.6075191 (https://doi.org/10.1109/ICCCT.2011.6075191)
Citation count: 4
Abstract
In this paper we suggest improvements to the existing C4.5 algorithm, a very popular tree-based classification algorithm used to generate a decision tree from a set of training examples. The heuristic function used in this algorithm is based on the concept of information entropy. We propose two new heuristic functions, each of which improves on the one used by C4.5 in a different respect. The first heuristic function is better in terms of execution time. The second heuristic function is more realistic: it gives importance to realistic attributes and thus gives more accurate and reasonable results. In this way we propose two new improvements over the J48/C4.5 algorithm. Throughout the paper we use two case studies (examples), one on weather and the other on student classification, to compare the performance of the algorithms.
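For readers unfamiliar with the entropy-based heuristic mentioned above, the following is a minimal sketch of the standard C4.5 gain ratio, the baseline the paper sets out to improve. It does not reproduce either of the two proposed heuristics, which the abstract does not specify. The function names and the toy weather data (the widely used 14-example "play" dataset, restricted to the outlook attribute) are assumptions for illustration and may differ from the data used in the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def gain_ratio(values, labels):
    """Standard C4.5 gain ratio of one categorical attribute.

    values: attribute value for each training example
    labels: class label for each training example
    """
    total = len(labels)

    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)

    # Expected entropy remaining after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder

    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Assumed toy data: outlook attribute and class label for 14 examples.
outlook = ["sunny"] * 5 + ["overcast"] * 4 + ["rainy"] * 5
play = (
    ["no", "no", "no", "yes", "yes"]      # sunny
    + ["yes"] * 4                          # overcast
    + ["yes", "yes", "yes", "no", "no"]   # rainy
)

if __name__ == "__main__":
    print(f"gain ratio (outlook): {gain_ratio(outlook, play):.3f}")
```

On this assumed data the gain ratio for outlook comes out at roughly 0.156. C4.5 computes this quantity for each candidate attribute at a node and splits on the attribute with the highest score, so any replacement heuristic would slot in where `gain_ratio` is called.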