Research on Credit Default Prediction Model Based on TabNet-Stacking

IF 2.1 · Zone 3, Physics and Astrophysics · JCR Q2, Physics, Multidisciplinary

Entropy · Pub Date: 2024-10-13 · DOI: 10.3390/e26100861

Shijie Wang, Xueyong Zhang

Citations: 0

Abstract

With the development of financial technology, traditional experience-based and single-network credit default prediction models can no longer meet current needs. This manuscript proposes a credit default prediction model based on TabNet-Stacking. First, an improved TabNet structure is constructed with the PyTorch deep learning framework. A multi-population genetic algorithm is used to optimize the Attention Transformer automatic feature selection module. A particle swarm algorithm is used to optimize hyperparameter selection and achieve automatic parameter search. Finally, Stacking ensemble learning is applied, with the improved TabNet used to extract features. XGBoost (eXtreme Gradient Boosting), LightGBM (Light Gradient Boosting Machine), CatBoost (Category Boosting), KNN (K-Nearest Neighbor), and SVM (Support Vector Machine) are selected as the first-layer base learners, and XGBoost is used as the second-layer meta-learner. The experimental results show that the proposed model outperforms the original comparison models in accuracy, precision, recall, F1 score, and AUC (Area Under the Curve).
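
The abstract mentions a particle swarm algorithm for automatic hyperparameter search but gives no configuration. The following is a minimal sketch of a basic global-best particle swarm optimizer, not the authors' implementation. The function names, the swarm settings (inertia, acceleration coefficients, swarm size), the two hyperparameters (a learning rate and a step count), and the toy loss are all assumptions made for illustration; in practice the objective would train a model and return a validation loss.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimize objective(position) inside box bounds with global-best PSO.

    All default settings are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Start particles uniformly inside the search box with zero velocity.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_score = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_score.__getitem__)
    global_best = personal_best[best_idx][:]
    global_score = personal_score[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own
                # best position, and pull toward the swarm's best position.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + c_social * r2 * (global_best[d] - positions[i][d])
                )
                # Move the particle and clamp it to the search box.
                moved = positions[i][d] + velocities[i][d]
                positions[i][d] = min(hi, max(lo, moved))
            score = objective(positions[i])
            if score < personal_score[i]:
                personal_best[i], personal_score[i] = positions[i][:], score
                if score < global_score:
                    global_best, global_score = positions[i][:], score
    return global_best, global_score


def toy_validation_loss(params):
    """Hypothetical stand-in for a validation loss.

    Its minimum sits at learning_rate=0.02, n_steps=5 by construction.
    """
    learning_rate, n_steps = params
    return ((learning_rate - 0.02) / 0.1) ** 2 + ((n_steps - 5.0) / 10.0) ** 2


# Search box: learning rate in [0.001, 0.1], step count in [3, 10].
bounds = [(0.001, 0.1), (3.0, 10.0)]
best_params, best_loss = pso_minimize(toy_validation_loss, bounds)
print("best params:", best_params, "loss:", best_loss)
```

An integer hyperparameter such as the step count would need rounding before it is passed to a real model; the sketch keeps it continuous for simplicity.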

Source journal

Entropy (Physics, Multidisciplinary)

CiteScore: 4.90
Self-citation rate: 11.10%
Articles published: 1580
Review time: 21.05 days

About the journal: Entropy (ISSN 1099-4300), an international and interdisciplinary journal of entropy and information studies, publishes reviews, regular research papers, and short notes. The journal's aim is to encourage scientists to publish their theoretical and experimental details as fully as possible. There is no restriction on the length of papers. Where computations or experiments are involved, the details must be provided so that the results can be reproduced.