Credit Risk Analysis using LightGBM and a comparative study of popular algorithms

2021 4th International Conference on Computing and Communications Technologies (ICCCT)
Pub Date: 2021-12-16
DOI: 10.1109/ICCCT53315.2021.9711896

J. Ponsam, S.V. Juno Bella Gracia, G. Geetha, S. Karpaselvi, K. Nimala

Citations: 5

Abstract

Credit risk analysis and mitigation have been an area of concern since the 2007–08 financial crisis. One of the main causes of the collapse was the high default rate on low-income security loans. Calculating a credit score can be complicated for people with thin or non-existent credit histories, and banks may refuse a loan if the score does not meet their requirements. The lack of a credit score is treated as an indicator of potential default, so banks avoid sanctioning loans to people in this category, although they still lend to applicants who are willing to offer securities. Credit scoring can be done with state-of-the-art machine learning models, and machine learning and data science are becoming increasingly important in the fintech world. Popular machine learning algorithms such as Random Forest and linear Support Vector Machines (SVMs) are currently in use. We explore credit risk analysis further with LightGBM as our algorithm of choice. It is an open-source gradient boosting framework developed by Microsoft in 2017. It is an ensemble model, which brings advantages such as better prediction and higher stability. Predictions aggregated from multiple models tend to be less noisy than those of a single model; this is one of the main reasons why an ensemble model such as LightGBM can outperform Logistic Regression and other algorithms such as SVMs for this use case.
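The abstract's argument that aggregated predictions are less noisy than a single model's can be illustrated with a small simulation. This is a minimal sketch and not the authors' experiment: it assumes each model's prediction is the true default probability plus independent Gaussian noise, and the function names, noise level, and ensemble size are all hypothetical. LightGBM itself builds its trees sequentially by gradient boosting, so the independence assumed here is an idealization of the averaging argument only.

```python
import random
import statistics


def noisy_prediction(true_value, noise_sd, rng):
    """One model's prediction: the true value plus independent Gaussian noise."""
    return true_value + rng.gauss(0.0, noise_sd)


def ensemble_prediction(true_value, noise_sd, n_models, rng):
    """Average the predictions of n_models independent noisy models."""
    predictions = [
        noisy_prediction(true_value, noise_sd, rng) for _ in range(n_models)
    ]
    return statistics.fmean(predictions)


def prediction_spread(n_models, trials=5000, true_value=0.2, noise_sd=0.1, seed=0):
    """Standard deviation of the (ensemble) prediction over repeated trials."""
    rng = random.Random(seed)
    predictions = [
        ensemble_prediction(true_value, noise_sd, n_models, rng)
        for _ in range(trials)
    ]
    return statistics.pstdev(predictions)


# Hypothetical settings: one model versus an average of 25 models.
single_spread = prediction_spread(n_models=1)
ensemble_spread = prediction_spread(n_models=25)

print(f"spread of a single model:      {single_spread:.4f}")
print(f"spread of a 25-model average:  {ensemble_spread:.4f}")
```

Under these assumptions the spread of the average shrinks roughly with the square root of the number of models, so 25 models give about one fifth of the single-model spread. Real ensemble members are correlated, so the reduction in practice is smaller.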


