Unmasking Banking Fraud: Unleashing the Power of Machine Learning and Explainable AI (XAI) on Imbalanced Data

Information | Pub Date: 2024-05-23 | DOI: 10.3390/info15060298
S. N. Nobel, Shirin Sultana, Sondip Poul Singha, S. Chaki, M. J. N. Mahi, Tony Jan, Alistair Barros, Md. Whaiduzzaman

Abstract

Recognizing fraudulent activity in the banking system is essential because of the significant risks involved. Fraudulent transactions are vastly outnumbered by non-fraudulent ones, and the resulting class imbalance makes fraud detection difficult. This study aims to determine the best model for detecting fraud by comparing four commonly used machine learning algorithms: Support Vector Machine (SVM), XGBoost, Decision Tree, and Logistic Regression. We applied the Synthetic Minority Over-sampling Technique (SMOTE) to address the class imbalance. The XGBoost classifier proved to be the most successful model for fraud detection on the imbalanced data, with an accuracy of 99.88%. We then used SHAP and LIME analyses to clarify the decision-making process of the XGBoost model and to identify the features that contribute most to fraud detection. These findings contribute to ongoing efforts to develop effective fraud detection systems for the banking industry.
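The abstract names SMOTE but does not describe its mechanics. As a rough illustration only, the core idea (creating synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours) can be sketched in plain Python. The function name `smote`, its parameters, and the toy `fraud` points are hypothetical and are not taken from the paper; the authors' actual pipeline and library choices are not stated in the abstract.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    Illustrative sketch of the basic SMOTE idea, not the authors' code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class (hypothetical two-feature "fraud" transactions).
fraud = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
new_points = smote(fraud, n_synthetic=6, k=2)
print(len(new_points))
```

Because every synthetic point lies on a line segment between two real minority samples, oversampling of this kind densifies the minority region rather than duplicating existing rows. In practice it is usually applied to the training split only, so that evaluation data contain no synthetic samples.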