Credit Card Fraud Detection in Financial Transactions Using Data Mining Techniques

2021 7th International Conference on Contemporary Information Technology and Mathematics (ICCITM) Pub Date : 2021-08-25 DOI:10.1109/ICCITM53167.2021.9677867

R. H. Alwan, Murtadha M. Hamad, O. Dawood


Citations: 0

Abstract

Every year, fraudulent credit card transactions cause losses of billions of dollars. Effective fraud detection algorithms are critical to reducing this loss, and fraud detection increasingly relies on advanced data mining approaches. Designing such algorithms is difficult because the data distribution is unstable and the classes are highly imbalanced, yet a fraud detection system must still classify a very large number of transactions. This paper proposes a system for detecting fraud in financial transactions using four data mining models: logistic regression, random forest, naïve Bayes, and support vector machine. The work proceeds in two basic steps. The first step uses the European cardholder dataset, which contains 284,807 transactions, and splits it into two groups: approximately 199,365 transactions (70%) for training the models and the remaining approximately 85,442 (30%) for testing them. Because the dataset is highly imbalanced, the SMOTE technique is used to transform it into a balanced one. The second step prepares the data, applies a correlation function to the training dataset, and then runs the models on it. The results are compared using evaluation metrics to show which model is best at detecting fraud. From these results, the paper concludes that the Random Forest classifier is the best for fraud detection, achieving 99.15% accuracy on the testing data.
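The abstract names SMOTE as the balancing step but gives no details. The following is a minimal pure-Python sketch of the general SMOTE idea, not the authors' implementation: each synthetic fraud sample is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours. The function names `smote` and `balance`, the choice k=5, the toy two-feature data, and the decision to balance only the training split are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative sketch only: brute-force neighbour search, numeric features.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return list(X) + new_points, list(y) + [minority_label] * len(new_points)


if __name__ == "__main__":
    # Hypothetical toy training split: 95 legitimate (0) and 5 fraud (1) rows.
    rng = random.Random(42)
    X_train = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(95)]
    X_train += [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(5)]
    y_train = [0] * 95 + [1] * 5

    X_bal, y_bal = balance(X_train, y_train)
    print(y_bal.count(0), y_bal.count(1))  # 95 95
```

In this sketch only the training rows are oversampled, so that a held-out test split would keep its original class proportions. The abstract does not state at which point the authors applied SMOTE, so this ordering is an assumption.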
