Fujiama Diapoldo Silalahi, Toni Wijanarko Adi Putra, Edy Siswanto
Journal of Technology Informatics and Engineering. Published 2022-04-26. DOI: 10.51903/jtie.v1i1.143 (https://doi.org/10.51903/jtie.v1i1.143)
MACHINE LEARNING TECHNIQUE FOR CREDIT CARD SCAM DETECTION
Credit card (CC) scams in financial markets are a growing problem. They are increasing rapidly and causing large financial losses for organizations, governments, and public institutions, especially now that e-commerce purchases can be paid for much more easily through digital payment methods. The purpose of this study is therefore to detect scam CC transactions in a given dataset by performing predictive analysis on the CC transaction dataset using machine learning techniques. The method is a predictive modeling approach: logistic regression (LR-M), random forest (RF), and XGBoost models are combined with several resampling techniques to predict whether CC transactions are scams or genuine. Model performance was evaluated using recall, precision, F1-score, and the PR and ROC curves.
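As a reminder of how the threshold-based metrics named above relate to each other, the following is a minimal sketch in plain Python. The function name `precision_recall_f1` and the toy labels are illustrative assumptions, not taken from the study, and the positive class (label 1) is assumed to mean a scam transaction.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # scams caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # scams missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (made up for illustration): 1 = scam, 0 = genuine.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because scam transactions are rare, these class-specific scores are usually more informative than plain accuracy, which a model can maximize by labeling everything as genuine.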
The experimental results show that random forest combined with the hybrid resampling approach of SMOTE and Tomek Links removal works better than the other models. Without resampling, the random forest and XGBoost models outperform LR-M in terms of overall F1-score. This demonstrates the strength of these techniques, which can deliver good performance even in the presence of class imbalance. Every model, when used with random undersampling (Ran-Under), gives a high recall score but fails badly in terms of precision. Compared with the corresponding models without resampling, precision and recall score (RS) do not improve when Tomek Links removal is used. RF and XGBoost perform quite well in terms of F1-score when random oversampling (Ran-Over) is used. SMOTE increases the recall score of random forest and XGBoost, but the precision score (PS) decreases slightly.
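To make the hybrid resampling idea concrete, here is a rough stdlib-only sketch of SMOTE-style oversampling followed by Tomek Links removal. It is an illustration under stated assumptions, not the authors' pipeline: the helper names (`smote_like`, `remove_tomek_links`), the toy points, the order of the two steps, and the choice to drop only the majority-class member of each link are all assumptions made for this example.

```python
import math
import random


def nearest_neighbor(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: math.dist(points[i], points[j]))


def remove_tomek_links(X, y, majority=0):
    """Drop majority points that are mutual nearest neighbours of a
    point with a different label (i.e. one end of a Tomek link)."""
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    drop = {i for i in range(len(X))
            if nn[nn[i]] == i and y[i] != y[nn[i]] and y[i] == majority}
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_like(X, y, minority=1, n_new=0, k=2, seed=0):
    """Add n_new synthetic minority points, each interpolated between a
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    pts = [x for x, label in zip(X, y) if label == minority]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(pts)
        neighbours = sorted((p for p in pts if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b)
                               for b, o in zip(base, other)))
    return X + synthetic, y + [minority] * n_new


# Toy 2-D data (made up): six genuine points (0) and three scams (1).
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),
     (3.1, 3.0), (3.0, 3.0), (4.0, 4.0), (4.0, 3.0)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

n_new = y.count(0) - y.count(1)            # oversample up to parity
X_s, y_s = smote_like(X, y, n_new=n_new)   # step 1: SMOTE-style points
X_h, y_h = remove_tomek_links(X_s, y_s)    # step 2: clean the boundary
print(len(X), len(X_s), len(X_h))
```

In the toy data, the genuine point at (3.1, 3.0) sits right next to the scam at (3.0, 3.0), so the two form a Tomek link in the original set. Removing such borderline majority points after oversampling is what distinguishes the hybrid from SMOTE alone.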
Finally, when the hybrid approach of Tomek Links removal and SMOTE was used with random forest, it gave balanced precision and recall scores, as reflected in the PR-AUC. XGBoost failed to increase the PS even though the same resampling technique was used. For future research, cost-sensitive learning can be applied by taking misclassification costs into account. Future research also needs to consider changes in behavior and to keep developing predictive models. In addition, much larger datasets are needed so that the handling of non-stationary properties in CC scam detection can be studied in more detail.
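One simple way to take misclassification costs into account is to pick the decision threshold that minimizes total cost on held-out data. The sketch below illustrates that idea only; it is not necessarily the cost-sensitive method the authors have in mind, and the function names, toy scores, and cost values are assumptions made for this example.

```python
def total_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total cost when every score >= threshold is flagged as a scam."""
    cost = 0.0
    for truth, score in zip(y_true, scores):
        flagged = score >= threshold
        if flagged and truth == 0:
            cost += cost_fp   # genuine transaction blocked
        elif not flagged and truth == 1:
            cost += cost_fn   # scam let through
    return cost


def best_threshold(y_true, scores, cost_fp, cost_fn):
    """Candidate threshold with the lowest total cost (ties: lowest)."""
    # float("inf") covers the option of flagging nothing at all.
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates,
               key=lambda th: total_cost(y_true, scores, th,
                                         cost_fp, cost_fn))


# Toy model scores (made up): higher means more likely to be a scam.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]

# A missed scam costs ten times a false alarm, then the reverse.
th_fn_heavy = best_threshold(y_true, scores, cost_fp=1, cost_fn=10)
th_fp_heavy = best_threshold(y_true, scores, cost_fp=10, cost_fn=1)
print(th_fn_heavy, th_fp_heavy)
```

When missed scams are the expensive error, the chosen threshold drops so that more transactions are flagged. When false alarms are expensive, it rises, trading recall for precision.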