A Fast Survival Support Vector Regression Approach to Large Scale Credit Scoring via Safe Screening.

IF 2.6 4区 计算机科学 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Big Data Pub Date : 2024-07-23 DOI:10.1089/big.2023.0033
Hong Wang, Ling Hong
{"title":"A Fast Survival Support Vector Regression Approach to Large Scale Credit Scoring via Safe Screening.","authors":"Hong Wang, Ling Hong","doi":"10.1089/big.2023.0033","DOIUrl":null,"url":null,"abstract":"<p><p>Survival models have found wider and wider applications in credit scoring recently due to their ability to estimate the dynamics of risk over time. In this research, we propose a Buckley-James safe sample screening support vector regression (BJS4VR) algorithm to model large-scale survival data by combing the Buckley-James transformation and support vector regression. Different from previous support vector regression survival models, censored samples here are imputed using a censoring unbiased Buckley-James estimator. Safe sample screening is then applied to discard samples that guaranteed to be non-active at the final optimal solution from the original data to improve efficiency. Experimental results on the large-scale real lending club loan data have shown that the proposed BJS4VR model outperforms existing popular survival models such as RSFM, CoxRidge and CoxBoost in terms of both prediction accuracy and time efficiency. Important variables highly correlated with credit risk are also identified with the proposed method.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6000,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Big Data","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1089/big.2023.0033","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

Abstract

Survival models have found wider and wider applications in credit scoring recently due to their ability to estimate the dynamics of risk over time. In this research, we propose a Buckley-James safe sample screening support vector regression (BJS4VR) algorithm to model large-scale survival data by combing the Buckley-James transformation and support vector regression. Different from previous support vector regression survival models, censored samples here are imputed using a censoring unbiased Buckley-James estimator. Safe sample screening is then applied to discard samples that guaranteed to be non-active at the final optimal solution from the original data to improve efficiency. Experimental results on the large-scale real lending club loan data have shown that the proposed BJS4VR model outperforms existing popular survival models such as RSFM, CoxRidge and CoxBoost in terms of both prediction accuracy and time efficiency. Important variables highly correlated with credit risk are also identified with the proposed method.

通过安全筛选进行大规模信用评分的快速生存支持向量回归方法。
由于生存模型能够估计随时间变化的风险动态,因此近来在信用评分领域得到了越来越广泛的应用。在这项研究中,我们提出了一种巴克利-詹姆斯安全样本筛选支持向量回归(BJS4VR)算法,通过结合巴克利-詹姆斯变换和支持向量回归,对大规模生存数据进行建模。与以往的支持向量回归生存模型不同,这里的删减样本是使用删减无偏的巴克利-詹姆斯估计器来估算的。然后应用安全样本筛选,从原始数据中剔除保证在最终最优解中不活跃的样本,以提高效率。在大规模真实借贷俱乐部贷款数据上的实验结果表明,所提出的 BJS4VR 模型在预测准确性和时间效率方面都优于现有的流行生存模型,如 RSFM、CoxRidge 和 CoxBoost。此外,所提出的方法还识别出了与信贷风险高度相关的重要变量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Big Data
Big Data COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS-COMPUTER SCIENCE, THEORY & METHODS
CiteScore
9.10
自引率
2.20%
发文量
60
期刊介绍: Big Data is the leading peer-reviewed journal covering the challenges and opportunities in collecting, analyzing, and disseminating vast amounts of data. The Journal addresses questions surrounding this powerful and growing field of data science and facilitates the efforts of researchers, business managers, analysts, developers, data scientists, physicists, statisticians, infrastructure developers, academics, and policymakers to improve operations, profitability, and communications within their businesses and institutions. Spanning a broad array of disciplines focusing on novel big data technologies, policies, and innovations, the Journal brings together the community to address current challenges and enforce effective efforts to organize, store, disseminate, protect, manipulate, and, most importantly, find the most effective strategies to make this incredible amount of information work to benefit society, industry, academia, and government. Big Data coverage includes: Big data industry standards, New technologies being developed specifically for big data, Data acquisition, cleaning, distribution, and best practices, Data protection, privacy, and policy, Business interests from research to product, The changing role of business intelligence, Visualization and design principles of big data infrastructures, Physical interfaces and robotics, Social networking advantages for Facebook, Twitter, Amazon, Google, etc, Opportunities around big data and how companies can harness it to their advantage.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信