Title: Machine Learning Pipeline for Fraud Detection and Prevention in E-Commerce Transactions
Authors: Resham Jhangiani, Doina Bein, Abhishek Verma
Venue: 2019 IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON)
Publication date: 2019-10-01
DOI: https://doi.org/10.1109/UEMCON47517.2019.8992993
Citations: 6
Abstract
Fraud has become a major problem in e-commerce, and substantial resources are being invested in recognizing and preventing it. Present fraud detection and prevention systems prevent only a small fraction of the fraudulent transactions processed, so losses still run to billions of dollars. There is an urgent need for better fraud detection and prevention, as online transactions are estimated to increase substantially in the coming year. We propose a data-driven model that uses machine learning algorithms on big data to predict the probability that a transaction is fraudulent or legitimate. The model was trained on historical e-commerce credit card transaction data to predict the probability that any future transaction by the customer is fraudulent. Supervised machine learning algorithms such as Random Forest, Support Vector Machine, Gradient Boost, and combinations of these are implemented, and their performance is compared. The problem of class imbalance is also taken into consideration: oversampling and data pre-processing are performed before a classifier is trained.
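The abstract does not say which oversampling technique was used. As an illustration only, the sketch below shows plain random oversampling, one of the simplest variants: minority-class rows are duplicated at random until every class has as many rows as the largest one. The function name `random_oversample`, the feature layout, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are the same size.

    rows and labels are parallel lists; a new pair of lists is returned and the
    inputs are left unchanged.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to make up the shortfall for this class.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy data: (amount, hour_of_day) features, label 1 = fraud.
train_rows = [(12.5, 9), (80.0, 14), (5.0, 20), (43.2, 11), (999.0, 3), (15.0, 16)]
train_labels = [0, 0, 0, 0, 1, 0]

balanced_rows, balanced_labels = random_oversample(train_rows, train_labels)
print(Counter(balanced_labels))
```

With this toy input, both classes end up with five rows. Oversampling of this kind is normally applied to the training split only, so that duplicated fraud cases do not leak into the data used to evaluate the classifier.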