Features Selection for Credit Risk Prediction Problem
Authors: Ines Gasmi, Sana Neji, Salima Smiti, Makram Soui
Journal: Information Systems Frontiers (impact factor 6.9; JCR Q1, Computer Science, Information Systems)
Publication type: Journal Article
Published: 2025-01-03
DOI: 10.1007/s10796-024-10559-x (https://doi.org/10.1007/s10796-024-10559-x)
Citations: 0
Abstract
Credit risk assessment has drawn great interest from both researchers and financial institutions. Classifying an applicant as a defaulter or non-defaulter helps banks make sound lending decisions. Applicants are classified using historical information on past loans. The datasets analyzed may include many features, a number of which may be irrelevant to the decision-making process. Keeping irrelevant features or leaving out relevant ones can be harmful, producing poor-quality patterns that may lead to confused decisions. Determining an appropriate set of predictors is therefore an important challenge in credit risk prediction research, and one that supports better decision-making. The task is to find the smallest subset of features that provides the highest accuracy and comprehensibility. This study therefore proposes a feature selection-based classification model for credit risk assessment. To this end, five algorithms are applied: Speed-constrained Multi-objective PSO (SMPSO), Non-dominated Sorting Genetic Algorithm II (NSGA-II), Sequential Forward Selection (SFS), Sequential Forward Floating Selection (SFFS), and Random Subset Feature Selection (RSFS). The selected subsets are evaluated with three classifiers: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Artificial Neural Network (ANN). The proposed model is validated on three real-world credit datasets. The results confirm that the SMPSO-KNN model selects the most significant features and provides the highest classification accuracy compared to existing models.
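To make the wrapper idea concrete, the following is a minimal sketch of Sequential Forward Selection scored by a KNN classifier. It is not the authors' implementation and does not cover the SMPSO-KNN model. The toy data, the leave-one-out scoring, the `min_gain` stopping rule, and all function names are assumptions made for illustration only.

```python
import random


def knn_loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of a k-NN classifier restricted to `features`."""
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the chosen features.
        dists = sorted(
            (sum((xi[f] - xj[f]) ** 2 for f in features), y[j])
            for j, xj in enumerate(X) if j != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        correct += predicted == y[i]
    return correct / len(X)


def sequential_forward_selection(X, y, k=3, min_gain=1e-9):
    """Greedily add the feature that most improves accuracy; stop when none helps."""
    remaining = set(range(len(X[0])))
    selected, best_score = [], 0.0
    while remaining:
        score, feature = max(
            (knn_loo_accuracy(X, y, selected + [f], k), f) for f in sorted(remaining)
        )
        if score - best_score <= min_gain:
            break  # no candidate improves accuracy, so keep the subset small
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


def make_toy_data(n=60, seed=0):
    """Hypothetical stand-in for a credit dataset: feature 0 is informative, 1-3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 0 = non-defaulter, 1 = defaulter (illustrative labels)
        X.append([label * 4.0 + rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(3)])
        y.append(label)
    return X, y


X, y = make_toy_data()
selected, score = sequential_forward_selection(X, y)
print("selected features:", selected, "accuracy:", round(score, 3))
```

Stopping as soon as no candidate feature raises the accuracy is one simple way to favor a small subset. The multi-objective methods named in the abstract (SMPSO, NSGA-II) instead treat subset size and accuracy as separate objectives, which this greedy sketch does not attempt.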
About the journal:
The interdisciplinary interfaces of Information Systems (IS) are fast emerging as defining areas of research and development in IS. These developments are largely due to the transformation of Information Technology (IT) towards networked worlds and its effects on global communications and economies. While these developments are shaping the way information is used in all forms of human enterprise, they are also setting the tone and pace of information systems of the future. The major advances in IT such as client/server systems, the Internet and the desktop/multimedia computing revolution, for example, have led to numerous important vistas of research and development with considerable practical impact and academic significance. While the industry seeks to develop high performance IS/IT solutions to a variety of contemporary information support needs, academia looks to extend the reach of IS technology into new application domains. Information Systems Frontiers (ISF) aims to provide a common forum of dissemination of frontline industrial developments of substantial academic value and pioneering academic research of significant practical impact.