Credit Card Fraud Detection under Extreme Imbalanced Data: A Comparative Study of Data-level Algorithms
Authors: Amit Singh, R. Ranjan, A. Tiwari
Journal: Journal of Experimental & Theoretical Artificial Intelligence, pp. 571-598
Published: 2021-04-03
DOI: https://doi.org/10.1080/0952813X.2021.1907795
ABSTRACT Credit card fraud is one of the biggest cybercrimes faced by users. Machine-learning-based systems for detecting fraudulent transactions are very effective in real-world scenarios. However, the machine learning approaches behind these systems suffer from imbalanced data, i.e. an imbalanced class distribution, so balancing the dataset becomes an imperative sub-task. A review of state-of-the-art approaches reveals the need for a systematic study of class imbalance handling strategies in order to design an intelligent and capable fraud detection system. This work provides a comparative study of different class imbalance handling methods. To compare the effectiveness and efficiency of these methods in conjunction with state-of-the-art classification approaches, we performed an extensive experimental study, evaluating them on performance indicators such as Precision, Recall, K-fold Cross-validation, AUC-ROC curve and execution time. We found that Oversampling followed by Undersampling performs well for ensemble classification models such as AdaBoost, XGBoost and Random Forest.
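The "oversampling followed by undersampling" idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which specific resampling algorithms the authors used, so this example assumes plain random oversampling and random undersampling; the function name `oversample_then_undersample`, the `over_ratio` parameter, and the toy data are all hypothetical and not taken from the paper.

```python
import random
from collections import Counter


def oversample_then_undersample(rows, labels, minority=1, over_ratio=0.5, seed=0):
    """Randomly oversample the minority class, then randomly undersample
    the majority class until both classes have the same size.

    over_ratio is a hypothetical knob: the minority class is first grown to
    over_ratio * (majority size) by sampling with replacement.
    """
    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    majority = [(r, y) for r, y in zip(rows, labels) if y != minority]
    if not minority_rows or not majority:
        raise ValueError("both classes must be present")

    # Step 1: oversample. Duplicate minority rows (with replacement) up to the target.
    target = max(len(minority_rows), int(over_ratio * len(majority)))
    extra = [rng.choice(minority_rows) for _ in range(target - len(minority_rows))]
    minority_rows = minority_rows + extra

    # Step 2: undersample. Keep a random subset of the majority of matching size.
    kept_majority = rng.sample(majority, min(len(majority), len(minority_rows)))

    combined = [(r, minority) for r in minority_rows] + kept_majority
    rng.shuffle(combined)
    new_rows = [r for r, _ in combined]
    new_labels = [y for _, y in combined]
    return new_rows, new_labels


# Toy data: 1000 legitimate transactions (label 0) and 5 fraudulent ones (label 1).
rows = [[float(i)] for i in range(1005)]
labels = [0] * 1000 + [1] * 5

balanced_rows, balanced_labels = oversample_then_undersample(rows, labels)
counts = Counter(balanced_labels)
print(counts[0], counts[1])
```

Resampling of this kind is normally applied to the training split only, so that Precision, Recall and AUC-ROC are still measured on data with the original class distribution.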
About the journal:
Journal of Experimental & Theoretical Artificial Intelligence (JETAI) is a world-leading journal dedicated to publishing high-quality, rigorously reviewed, original papers in artificial intelligence (AI) research.
The journal features work in all subfields of AI research and accepts both theoretical and applied research. Topics covered include, but are not limited to, the following:
• cognitive science
• games
• learning
• knowledge representation
• memory and neural system modelling
• perception
• problem-solving