Improving accuracy of C4.5 algorithm using split feature reduction model and bagging ensemble for credit card risk prediction
Authors: M. A. Muslim, A. Nurzahputra, B. Prasetiyo
Published in: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 141-145, 2018-03-01
DOI: 10.1109/ICOIACT.2018.8350753 (https://doi.org/10.1109/ICOIACT.2018.8350753)
Citations: 18
Abstract
Whether to grant credit to a prospective debtor is decided on the basis of credit scoring, so the accuracy with which credit scoring classifies debtor data is very important. Classification methods can be applied to this task; one such method is the decision tree, and one decision tree algorithm that can be used is C4.5. This paper addresses how to increase the accuracy of the C4.5 algorithm in predicting credit receipts. Accuracy is increased by applying the Split Feature Reduction Model and a Bagging Ensemble. The Split Feature Reduction Model is applied during preprocessing and divides the dataset into n splits. In this paper, the dataset is divided into 4 splits: Split 1 consists of 16 features, Split 2 of 12 features, Split 3 of 8 features, and Split 4 of 4 features. The C4.5 algorithm is then applied to every split. The best accuracy obtained by applying the split feature reduction model with C4.5 is 73.1%, in Split 3. The best accuracy obtained by applying the split feature reduction model and bagging ensemble with C4.5 is 75.1%, also in Split 3. Compared with the accuracy of the stand-alone C4.5 algorithm, applying the split feature reduction model and bagging ensemble increased accuracy by 4.6%.
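
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the abstract does not say how features are chosen for each split, so the sketch assumes they are ranked by C4.5's gain ratio and that each split keeps the top k; a one-level decision stump stands in for a full C4.5 tree; the data is synthetic; and all function names are hypothetical.

```python
import math
import random
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def group_labels(rows, labels, f):
    """Group class labels by the value of categorical feature f."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[f], []).append(y)
    return groups

def gain_ratio(rows, labels, f):
    """Gain ratio of feature f: information gain divided by split information."""
    n = len(rows)
    groups = group_labels(rows, labels, f)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy([row[f] for row in rows])
    return (entropy(labels) - remainder) / split_info if split_info > 0 else 0.0

def top_k_features(rows, labels, k):
    """Assumed reduction step: keep the k features with the highest gain ratio."""
    ranked = sorted(range(len(rows[0])),
                    key=lambda f: gain_ratio(rows, labels, f), reverse=True)
    return ranked[:k]

def fit_stump(rows, labels, features):
    """Stand-in for C4.5: split once on the best feature, predict the majority class."""
    best = max(features, key=lambda f: gain_ratio(rows, labels, f))
    default = Counter(labels).most_common(1)[0][0]
    table = {v: Counter(g).most_common(1)[0][0]
             for v, g in group_labels(rows, labels, best).items()}
    return lambda row: table.get(row[best], default)

def fit_bagging(rows, labels, features, n_models=15, seed=0):
    """Bagging: train each model on a bootstrap sample, predict by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        models.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], features))
    return lambda row: Counter(m(row) for m in models).most_common(1)[0][0]

def accuracy(model, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def demo():
    """Toy run: 16 binary features, label fully determined by feature 0."""
    rng = random.Random(42)
    rows = [[rng.choice("ab") for _ in range(16)] for _ in range(200)]
    labels = ["good" if r[0] == "a" else "bad" for r in rows]
    results = {}
    for k in (16, 12, 8, 4):  # feature counts of Split 1 to Split 4
        feats = top_k_features(rows, labels, k)
        results[k] = accuracy(fit_bagging(rows, labels, feats), rows, labels)
    return results

if __name__ == "__main__":
    for k, acc in demo().items():
        print(f"{k} features: training accuracy {acc:.3f}")
```

The toy run measures accuracy on the training data only, so its numbers say nothing about the 73.1% and 75.1% figures reported above; a faithful comparison would need the paper's dataset, a full C4.5 implementation, and a held-out evaluation.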