Utilizing Prediction Intervals for Unsupervised Detection of Fraudulent Transactions: A Case Study
I. Hewapathirana
Asian Journal of Engineering and Applied Technology, published 2022-10-28
DOI: 10.51983/ajeat-2022.11.2.3348 (https://doi.org/10.51983/ajeat-2022.11.2.3348)
Abstract
Money laundering has a strongly negative impact on the growth of a country’s national economy. As financial sectors become increasingly integrated, effective technological measures are needed to counter such fraudulent operations. Machine learning methods are widely used to classify an incoming transaction as fraudulent or non-fraudulent by analyzing the behaviour of past transactions. Unsupervised methods do not require labels on past transactions; the classification is based solely on the distribution of the transaction data. This research presents three unsupervised classification methods: ordinary least squares regression-based (OLS) fraud detection, random forest-based (RF) fraud detection and dropout neural network-based (DNN) fraud detection. For each method, the goal is to classify an incoming transaction amount as fraudulent or non-fraudulent. The novelty of the proposed approach is its use of prediction intervals to validate incoming transactions automatically. The three methods are applied to a real-world dataset of credit card transactions. The fraud labels available for the dataset are withheld during model training and used later only to evaluate the final predictions. The proposed methods are further compared with two other state-of-the-art unsupervised methods. In the experiments, the OLS and RF methods perform best at predicting the correct label of a transaction, while the DNN method is the most robust at detecting fraudulent transactions. Calculating prediction intervals to validate an incoming transaction opens a new direction for unsupervised fraud detection. Since fraud labels on past transactions are not required for training, the proposed methods can be applied in an online setting to different areas, such as money laundering detection, telecommunication fraud detection and intrusion detection.
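
To make the prediction-interval idea concrete, the following is a minimal sketch of how an OLS-based check might look. It is not the paper's implementation, since the abstract does not give the model details. It assumes a single hypothetical feature `x` (for example, a cardholder's average past amount), synthetic data, a large-sample normal approximation for the interval, and the rule that a transaction is flagged when its amount falls outside the interval. All function names are illustrative.

```python
import math
import random
from statistics import NormalDist, fmean


def fit_ols(xs, ys):
    """Fit amount = a + b*x by least squares and keep what the interval needs."""
    n = len(xs)
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sigma = math.sqrt(rss / (n - 2))  # residual standard error
    return {"a": a, "b": b, "n": n, "x_bar": x_bar, "sxx": sxx, "sigma": sigma}


def prediction_interval(model, x0, level=0.95):
    """Interval for a new amount at feature value x0.

    Assumption: the normal quantile stands in for the Student-t quantile,
    which is reasonable only when the training sample is large.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    y_hat = model["a"] + model["b"] * x0
    spread = 1 + 1 / model["n"] + (x0 - model["x_bar"]) ** 2 / model["sxx"]
    half_width = z * model["sigma"] * math.sqrt(spread)
    return y_hat - half_width, y_hat + half_width


def is_suspicious(model, x0, amount, level=0.95):
    """Assumed decision rule: flag the amount if it lies outside the interval."""
    low, high = prediction_interval(model, x0, level)
    return not (low <= amount <= high)


# Synthetic, unlabeled "past transactions" (made up for illustration only).
rng = random.Random(0)
past_x = [rng.uniform(10, 100) for _ in range(500)]
past_amount = [5 + 1.2 * x + rng.gauss(0, 4) for x in past_x]
model = fit_ols(past_x, past_amount)

print(prediction_interval(model, 50))
print(is_suspicious(model, 50, 65))   # amount close to the fitted line
print(is_suspicious(model, 50, 200))  # amount far above the fitted line
```

No fraud labels are used when fitting, which is what makes the check unsupervised. Labels would only be needed afterwards, to measure how often the flagged transactions are actually fraudulent.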