GAN-Based Hybrid Sampling Method for Transaction Fraud Detection
Authors: Yu Xie; Junkai Shan; Lifei Wei; Jiamin Yao; MengChu Zhou
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 10, pp. 5905-5918 (impact factor 10.4; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-07-16
DOI: 10.1109/TKDE.2025.3589885
URL: https://ieeexplore.ieee.org/document/11081459/
In the digital era, effective Transaction Fraud Detection (TFD) is essential to ensuring financial security. The considerable class imbalance, with legitimate transactions vastly outnumbering fraudulent ones, presents a significant challenge for TFD models to accurately identify fraudulent patterns. While existing sample-balancing strategies address class imbalance effectively in many contexts, they often fall short in TFD due to fraudsters’ sophisticated concealment tactics, which lead to pronounced behavioral overlap between fraudulent and legitimate transactions. In this paper, we introduce a novel Generative Adversarial Network-based Hybrid Sampling method (GANHS) to effectively address the class imbalance issue. GANHS employs a dual-discriminator generative adversarial network to generate synthetic samples that accurately reflect the characteristics of fraudulent activity, while an adaptive neighborhood-based undersampling technique refines these samples to minimize overlap with legitimate ones. This hybrid approach not only enhances the model’s ability to learn fraud patterns by generating high-quality samples but also improves its resilience against highly concealed fraudulent activities. Experiments on real-world and public datasets demonstrate that GANHS outperforms its competitive peers, with gains of 0.5%–8.7% in average $F_{1}$-Score and 1.0%–7.0% in G-mean, highlighting its strong potential for improving the reliability and effectiveness of TFD systems in complex, high-risk financial scenarios.
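The two reported metrics can be made concrete with a small sketch. It assumes the standard definitions: the F1-score as the harmonic mean of precision and recall on the fraud class, and G-mean as the geometric mean of recall and specificity. The helper names and the toy labels are illustrative assumptions, not taken from the paper, and the paper's own evaluation protocol may differ in detail.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the fraud label."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when nothing is positive."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def g_mean(tp, fp, tn, fn):
    """G-mean = sqrt(recall * specificity)."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall * specificity)


# Hypothetical toy data: 10 fraudulent and 990 legitimate transactions.
y_true = [1] * 10 + [0] * 990

# A model that labels everything legitimate is 99% accurate but useless.
all_legit = [0] * 1000
tp, fp, tn, fn = confusion_counts(y_true, all_legit)
accuracy_all_legit = (tp + tn) / len(y_true)
f1_all_legit = f1_score(tp, fp, fn)
gmean_all_legit = g_mean(tp, fp, tn, fn)

# A model that catches 8 of the 10 frauds and wrongly flags 20 legitimate ones.
detector = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970
tp, fp, tn, fn = confusion_counts(y_true, detector)
f1_detector = f1_score(tp, fp, fn)
gmean_detector = g_mean(tp, fp, tn, fn)

print(accuracy_all_legit, f1_all_legit, gmean_all_legit)
print(round(f1_detector, 3), round(gmean_detector, 3))
```

Under heavy imbalance, accuracy rewards the trivial all-legitimate model, while both F1 and G-mean drop to zero for it. This is why imbalance-aware metrics of this kind are commonly used to evaluate fraud detectors.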
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.