Authors: R. Qaddoura, Mariam M. Biltawi
Venue: 2022 International Engineering Conference on Electrical, Energy, and Artificial Intelligence (EICEEAI)
Published: 2022-11-06
DOI: 10.1109/EICEEAI56378.2022.10050500
Improving Fraud Detection in An Imbalanced Class Distribution Using Different Oversampling Techniques
Credit card fraud detection is essential for financial institutions so that customers are not charged for items they did not purchase. Fraud detection can be implemented with machine learning (ML) by training a model on a dataset of transactions labeled as fraud or non-fraud. The datasets available for this task are usually highly imbalanced. The goal of this paper is therefore to conduct a comprehensive comparison of five oversampling techniques for addressing the class imbalance problem: the Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), Borderline-SMOTE1, Borderline-SMOTE2, and Support Vector Machine SMOTE (SVM SMOTE). The comparison is conducted by computing the geometric mean, recall, precision, and F1-score of six ML models with and without oversampling. The models are logistic regression, random forest, K-nearest neighbor, naive Bayes, support vector machine, and decision tree. Experimental results show that the oversampling techniques improved the models' performance.
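The two ingredients named in the abstract, synthetic oversampling of the minority class and imbalance-aware metrics, can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not say which library they used. The function names `smote_like` and `imbalance_metrics` and the toy data are hypothetical. The sketch assumes the basic SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours, and it assumes the common binary definition of the geometric mean as the square root of recall times specificity. The other four techniques differ mainly in which minority samples they choose to interpolate from, which is not shown here.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def imbalance_metrics(y_true, y_pred, positive=1):
    """Recall, precision, F1 and geometric mean for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Assumed definition: geometric mean of recall and specificity
    g_mean = math.sqrt(recall * specificity)
    return {"recall": recall, "precision": precision,
            "f1": f1, "g_mean": g_mean}


# Toy data (hypothetical): four fraud samples with two features each
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like(minority, n_new=6)

# Toy predictions (hypothetical): 2 fraud and 8 non-fraud transactions
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = imbalance_metrics(y_true, y_pred)
print(len(new_points), metrics)
```

On the toy predictions, plain accuracy is 80% even though half of the fraud cases are missed. This is why metrics such as recall and the geometric mean are the more informative choice on imbalanced data.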