{"title":"Feature Selection Based on SVM for Credit Scoring","authors":"Ping Yao","doi":"10.1109/CINC.2009.36","journal":{"name":"2009 International Conference on Computational Intelligence and Natural Computing","volume":"155 1","pages":"0"},"publicationDate":"2009-06-06","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"10","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/CINC.2009.36"}
As the credit industry has grown rapidly, bank credit departments have collected large volumes of consumer credit data, and credit scoring has become an important issue. Credit datasets usually contain many redundant features, which lowers the accuracy and raises the complexity of the credit scoring model, so effective feature selection methods are necessary for credit datasets with a large number of features. This paper compares seven well-known feature selection methods for credit scoring: the t-test, principal component analysis (PCA), factor analysis (FA), stepwise regression, rough sets (RS), classification and regression trees (CART), and multivariate adaptive regression splines (MARS). A support vector machine (SVM) is used as the classification model. Two credit scoring databases are used to provide a reliable conclusion. In the experiments, the CART and MARS methods outperform the other methods in overall accuracy, type I error, and type II error.
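To make the simplest of the seven methods concrete, the following is a minimal sketch of t-test-based feature ranking, not the paper's implementation. The abstract does not say which t-test variant, threshold, or number of retained features was used, so the choice of Welch's t statistic, the helper names `welch_t` and `rank_features_by_t`, the 0/1 label coding, and the toy data are all assumptions made for illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances).

    Assumption: the paper does not state which t-test variant it uses.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_features_by_t(X, y):
    """Return (feature_index, |t|) pairs, most discriminative feature first.

    X is a list of rows (one list of floats per applicant); y holds 0/1 labels
    (hypothetical coding: 0 = good credit, 1 = bad credit).
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        good = [row[j] for row, label in zip(X, y) if label == 0]
        bad = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((j, abs(welch_t(good, bad))))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data, made up for illustration: feature 0 separates the two classes,
# feature 1 has the same values in both classes and carries no signal.
X = [
    [1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [1.1, 6.0],
    [3.0, 5.0], [3.2, 3.0], [2.9, 4.0], [3.1, 6.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ranking = rank_features_by_t(X, y)
top_k = [index for index, _ in ranking[:1]]  # keep only the best feature
X_selected = [[row[j] for j in top_k] for row in X]
```

In a pipeline like the one the abstract describes, the reduced matrix `X_selected` would then be passed to the SVM classifier in place of the full feature set.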