Deep convolutional neural network based synthetic minority over sampling technique: a forfending model for fraudulent credit card transactions in financial institution

L. G. Salaudeen, D. Gabi, M. Garba, H. Suru
Journal of the Nigerian Society of Physical Sciences. Published 2024-05-12. DOI: 10.46481/jnsps.2024.2037 (https://doi.org/10.46481/jnsps.2024.2037)

Abstract

Fraudulent credit card transactions are carried out by unauthorized individuals and organizations using tactics such as phishing and social engineering. Researchers have proposed several Machine Learning (ML) techniques to counter credit card fraud. However, these ML approaches face challenges of their own, which make credit card fraud extremely difficult to detect. This study proposes a Deep Convolutional Neural Network (DCNN) combined with the Synthetic Minority Oversampling Technique (SMOTE) as a solution. A Kaggle dataset with 284,807 records and 31 features was used. The implementation ran on the Google Colab cloud platform, in a Jupyter notebook environment with Graphics Processing Units (GPUs). Two experiments were conducted. The first sought the most suitable model among the baselines: Logistic Regression (LR), Random Forest (RF), Isolation Forest, and a single Deep Learning (DL) model, a Multilayer Perceptron (MLP). The baseline models produced overfitted accuracy scores, with recall, specificity, precision, and F1-score each reported as 1.00%. Because the data distribution is imbalanced, this outcome is biased and is not sufficient to support conclusions. This led to a second experimental phase in which a Light Gradient Boosting Machine (LGBM), an Artificial Neural Network (ANN), and the proposed DCNN+SMOTE were evaluated alongside the baseline models. In simulation, the proposed DCNN+SMOTE performed best across the board, with each metric reported as 1.00%. Its Error Rate (ER) and Null Error Rate (NER) are each 0.00%, and its False Positive Rate (FPR) is 0.001%, lower than that of the baseline models.
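The evaluation metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of the standard textbook definitions, not the authors' code. The class name `ConfusionMatrix` and the helper `_ratio` are hypothetical, and the split of 492 fraud cases among the 284,807 records is an assumption taken from the widely used Kaggle credit card dataset, since the abstract does not state the class distribution.

```python
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class ConfusionMatrix:
    """Hypothetical container; fraud is treated as the positive class."""
    tp: int  # fraud correctly flagged
    fp: int  # legitimate transactions flagged as fraud
    tn: int  # legitimate transactions correctly passed
    fn: int  # fraud that was missed

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn

    def accuracy(self) -> float:
        return _ratio(self.tp + self.tn, self.total)

    def error_rate(self) -> float:
        # ER: share of all predictions that are wrong.
        return _ratio(self.fp + self.fn, self.total)

    def recall(self) -> float:
        return _ratio(self.tp, self.tp + self.fn)

    def specificity(self) -> float:
        return _ratio(self.tn, self.tn + self.fp)

    def precision(self) -> float:
        return _ratio(self.tp, self.tp + self.fp)

    def f1(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0

    def false_positive_rate(self) -> float:
        # FPR = 1 - specificity.
        return _ratio(self.fp, self.fp + self.tn)

    def null_error_rate(self) -> float:
        # NER: error made by always predicting the majority class.
        positives = self.tp + self.fn
        negatives = self.tn + self.fp
        return _ratio(min(positives, negatives), self.total)


# Assumed class split (not given in the abstract): 492 fraud cases.
TOTAL_RECORDS = 284_807
ASSUMED_FRAUD = 492

# A trivial classifier that labels every transaction as legitimate.
always_legit = ConfusionMatrix(
    tp=0, fp=0, tn=TOTAL_RECORDS - ASSUMED_FRAUD, fn=ASSUMED_FRAUD
)

print(f"accuracy: {always_legit.accuracy():.4f}")
print(f"recall:   {always_legit.recall():.4f}")
print(f"NER:      {always_legit.null_error_rate():.6f}")
```

Under the assumed split, this trivial classifier reaches an accuracy of about 99.8% while catching no fraud at all, which illustrates why accuracy alone is a biased indicator on imbalanced data and why recall, precision, F1-score, and FPR are reported alongside it.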