Explainable prediction of loan default based on machine learning models
Xu Zhu, Qingyong Chu, Xinchang Song, Ping Hu, Lu Peng
Data Science and Management, published 2023-09-01
DOI: 10.1016/j.dsm.2023.04.003
https://www.sciencedirect.com/science/article/pii/S2666764923000218
Citations: 1
Abstract
Owing to the convenience of online loans, an increasing number of people borrow money on online platforms. With the emergence of machine learning technology, predicting loan defaults has become a popular topic. However, machine learning models suffer from a black-box problem that cannot be disregarded. To make the rules of a prediction model more understandable, and thereby increase users' trust in the model, an explanatory model must be used. Logistic regression, decision tree, XGBoost, and LightGBM models are employed to predict loan defaults. The prediction results show that LightGBM and XGBoost outperform the logistic regression and decision tree models in predictive ability. The area under the curve (AUC) for LightGBM is 0.7213. The accuracies of LightGBM and XGBoost exceed 0.8, and their precisions exceed 0.55. In addition, we employed the local interpretable model-agnostic explanations (LIME) approach to conduct an explainable analysis of the prediction results. The results show that factors such as the loan term, loan grade, credit rating, and loan amount affect the predicted outcomes.
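The abstract reports three different metrics (area under the curve, accuracy, and precision), and it may help to see how they are computed and why they can differ on the same predictions. The following is a minimal sketch in plain Python, not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Of the loans predicted to default (1), the fraction that did default."""
    predicted_pos = sum(1 for p in y_pred if p == 1)
    if predicted_pos == 0:
        return 0.0  # convention chosen here: no positive predictions were made
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return true_pos / predicted_pos


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random defaulter is scored above a random non-defaulter.

    Ties count as one half (the pairwise, Mann-Whitney formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration: 1 = default, 0 = repaid.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.6, 0.25, 0.05, 0.35, 0.7, 0.4]
# The 0.5 threshold is an assumption, not a value taken from the paper.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy :", accuracy(y_true, y_pred))
print("precision:", precision(y_true, y_pred))
print("AUC      :", roc_auc(y_true, scores))
```

Accuracy and precision depend on the chosen threshold, whereas the area under the curve depends only on how the scores rank defaulters relative to non-defaulters. On imbalanced data, where defaults are the minority class, accuracy can be high while precision stays modest, which is one reason to report all three together.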