Financial Evaluation Model and Algorithm Based on Data Mining

G. Cheng
Published in: 2021 IEEE 3rd International Conference on Civil Aviation Safety and Information Technology (ICCASIT)
Publication date: 2021-12-18
DOI: https://doi.org/10.1145/3510858.3510914

Abstract

With the development of information technology, the traditional financial industry has entered a period of rapid growth. Technological advances have greatly expanded the business scope of financial institutions and raised both service levels and user experience. However, new credit risks inevitably emerge in many areas of the financial market, lending in particular. Lending, one of the core businesses of the financial industry, generates large profits for financial institutions but depends heavily on the quality of risk control. To minimize risk, financial institutions want to use emerging internet technology to analyze massive data, mine useful information, and refine risk indicators. How to use technologies such as big data and data mining to assess loan defaults is therefore becoming a pressing issue for financial institutions and an important research direction. In this paper, 150,000 loan-customer records are obtained from a Kaggle credit scoring dataset, and statistical pre-processing is applied to clean unreasonable data such as duplicate, missing, and abnormal values. Using logistic regression, an interpretable credit evaluation model is built on users' credit records to predict each user's likelihood of default in the coming years. The resulting quantitative score of default likelihood helps financial institutions control their risk.
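The pipeline the abstract outlines (clean duplicate, missing, and abnormal values, then fit a logistic regression to score default likelihood) can be illustrated with a minimal standard-library sketch. It is not the paper's implementation: the field names `utilization`, `late_payments`, and `default`, the toy records, the median imputation, the clipping bounds, and the plain gradient-descent settings are all assumptions chosen for illustration.

```python
import math
import statistics


def clean_records(records, feature, lo, hi):
    """Drop exact duplicates, fill missing values of `feature` with the
    median of the observed values, and clip abnormal values to [lo, hi]."""
    seen, unique = set(), []
    for r in records:
        key = tuple(sorted(r.items()))  # dict keys are unique, so values are never compared
        if key not in seen:
            seen.add(key)
            unique.append(dict(r))
    observed = [r[feature] for r in unique if r[feature] is not None]
    median = statistics.median(observed)
    for r in unique:
        value = median if r[feature] is None else r[feature]
        r[feature] = min(max(value, lo), hi)
    return unique


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def default_probability(w, b, x):
    """Predicted probability of default for one feature vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Hypothetical toy records, not taken from the Kaggle dataset.
records = [
    {"utilization": 0.1, "late_payments": 0, "default": 0},
    {"utilization": 0.1, "late_payments": 0, "default": 0},   # exact duplicate
    {"utilization": None, "late_payments": 1, "default": 0},  # missing value
    {"utilization": 0.9, "late_payments": 3, "default": 1},
    {"utilization": 5.0, "late_payments": 4, "default": 1},   # abnormal value
    {"utilization": 0.2, "late_payments": 0, "default": 0},
    {"utilization": 0.8, "late_payments": 2, "default": 1},
]

cleaned = clean_records(records, "utilization", lo=0.0, hi=1.0)
X = [[r["utilization"], r["late_payments"]] for r in cleaned]
y = [r["default"] for r in cleaned]
w, b = fit_logistic(X, y)

p_safe = default_probability(w, b, [0.1, 0])
p_risky = default_probability(w, b, [0.9, 3])
print(f"low-risk applicant: {p_safe:.3f}, high-risk applicant: {p_risky:.3f}")
```

Because the model is linear in its inputs, each fitted weight in `w` can be read as the change in the log-odds of default per unit of the corresponding feature, which is what makes this kind of model interpretable. On real data, a library implementation with regularization and a held-out evaluation set would be the more usual choice.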