Comparative Analysis of Machine Learning Methods Application for Financial Fraud Detection

A. Menshchikov, V. Perfilev, Denis Roenko, M. Zykin, Maksim Fedosenko
Published in: 2022 32nd Conference of Open Innovations Association (FRUCT)
Publication date: 2022-11-09
DOI: 10.23919/FRUCT56874.2022.9953872 (https://doi.org/10.23919/FRUCT56874.2022.9953872)
Citations: 1

Abstract

This paper addresses the fraud detection problem in the context of Big Data used in remote banking systems. It aims to propose a new algorithm for the automatic detection of fraudulent transactions using machine learning, with performance sufficient for use in big data systems. The article identifies promising directions for optimizing fraudulent transaction detection methods in anti-fraud systems. Architectural approaches to the operation of anti-fraud systems were studied, and on that basis an architecture for predicting illegal actions in near real time was proposed. The research task is to find the most suitable machine learning algorithm: one with the least training and prediction time that still demonstrates high classification performance. To this end, supervised and ensemble machine learning algorithms were analyzed. The dataset was preprocessed for the experiment with SMOTE resampling and robust scaling. The chosen methods were compared on several metrics: $F_1$ score, AUC, and the time consumed by training and classification. The comparison showed that the multilayer perceptron (MLP) and the boosting methods (Adaptive, Gradient, XGBoost) have the highest classification performance, but MLP outperforms the boosting methods in classification time. MLP was therefore selected as the most appropriate algorithm for further integration into the proposed Big Data architecture. Based on the data obtained during the experiments, the extent to which these methods can be implemented in fraud detection systems was assessed, and an architecture for a big data anti-fraud system was proposed.
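The abstract names the preprocessing steps (robust scaling, SMOTE resampling) and the comparison metrics ($F_1$ score, AUC, time consumption) but gives no code. The following is a minimal, standard-library-only sketch of what those steps compute, not the authors' implementation. All function names (`robust_scale`, `smote`, `f1_score`, `roc_auc`, `timed`) and parameter defaults are hypothetical and chosen for illustration; in practice one would more likely rely on established implementations such as scikit-learn's `RobustScaler` and imbalanced-learn's `SMOTE`.

```python
import math
import random
import statistics
import time


def robust_scale(column):
    """Center a feature on its median and divide by its interquartile range."""
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0  # fall back to 1.0 for constant features
    return [(x - median) / iqr for x in column]


def smote(minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling: build n_new synthetic minority samples by
    interpolating between a random minority point and one of its k nearest
    minority neighbours. Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), with fraud encoded as label 1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half). Quadratic time; fine for a sketch."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for one call, e.g. fit or predict."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

A candidate model would then be scored by wrapping its training and prediction calls in `timed` and passing its predictions to `f1_score` and `roc_auc`, which mirrors the three comparison axes listed in the abstract. Oversampling should be applied to the training split only, so that synthetic points do not leak into the evaluation data.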