A Comparison of Missing Value Imputation Techniques on Coupon Acceptance Prediction

International Journal of Information Technology and Computer Science. Pub Date: 2022-10-08. DOI: 10.5815/ijitcs.2022.05.02

Rahin Atiq, Farzana Fariha, Mutasim Mahmud, Sadman S. Yeamin, K. I. Rushee, Shamsur Rahim


Citations: 2

Abstract

The In-Vehicle Coupon Recommendation System presents users with different driving scenarios and records whether they would accept a coupon in each one. The coupons offered in the survey were for bars, coffee shops, restaurants, and take-away outlets. The dataset consists of various attributes that capture information about the respondents, which is used to make a coupon recommendation. It helps shops determine whether the coupons they offer are beneficial, depending on the characteristics and scenarios of the users. A major problem with this dataset is that it is imbalanced and contains missing values, and the way missing values and class imbalance are handled can affect the prediction results. In this paper, we analysed the impact of four imputation techniques (frequent value, mean, KNN, and MICE) for replacing the missing values, and used the imputed data to build prediction models. For the models, we applied six classification algorithms (Naive Bayes, Deep Learning, Logistic Regression, Decision Tree, Random Forest, and Gradient Boosted Tree). The paper aims to analyse the impact of the imputation techniques on the dataset, alongside the outcomes of the classifiers, to find the most accurate model, so that shops offering coupons or vouchers can get a realistic picture of their target customers. We found that KNN imputation combined with the Deep Learning classifier gave the best results in terms of prediction accuracy and false-negative rate.
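To make the imputation strategies named in the abstract concrete, the following is a minimal sketch of three of them (frequent value, mean, and KNN), together with the false-negative rate used as an evaluation measure. It is not the authors' implementation: the abstract does not state which tools or settings were used, so the function names, the toy numeric values, the distance definition, and k=2 are all assumptions made for illustration. MICE is omitted because it needs an iterative regression loop that does not fit in a short example.

```python
from collections import Counter
from math import sqrt
from statistics import mean

MISSING = None  # marker for a missing cell in this toy example (an assumption)


def impute_most_frequent(column):
    """Replace missing cells with the most common observed value."""
    observed = [v for v in column if v is not MISSING]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is MISSING else v for v in column]


def impute_mean(column):
    """Replace missing cells with the mean of the observed values."""
    fill = mean(v for v in column if v is not MISSING)
    return [fill if v is MISSING else v for v in column]


def impute_knn(rows, k=2):
    """Fill each missing cell with the column mean over the k nearest rows.

    Distance is the root mean squared difference over the columns that
    both rows have observed (one common choice, assumed here).
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not MISSING and y is not MISSING]
        if not shared:
            return float("inf")
        return sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            # Donors are the other rows that have column j observed.
            donors = [r for idx, r in enumerate(rows)
                      if idx != i and r[j] is not MISSING]
            donors.sort(key=lambda r: distance(row, r))
            new_row[j] = mean(r[j] for r in donors[:k])
        filled.append(new_row)
    return filled


def false_negative_rate(y_true, y_pred):
    """FNR = FN / (FN + TP), treating 1 (coupon accepted) as the positive class."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return fn / (fn + tp)


# Hypothetical toy data, not taken from the coupon dataset.
visits = [1, 3, MISSING, 3, 2]
mode_filled = impute_most_frequent(visits)
mean_filled = impute_mean(visits)

rows = [[1.0, 10.0], [2.0, MISSING], [3.0, 30.0], [10.0, 100.0]]
knn_filled = impute_knn(rows, k=2)

fnr = false_negative_rate([1, 1, 1, 0, 0], [1, 0, 1, 0, 1])
```

On the toy column, frequent-value imputation fills the gap with 3 while mean imputation fills it with 2.25, which shows how the choice of technique changes the data a classifier is trained on. For categorical attributes, mean imputation and the numeric distance above would first require an encoding step, which is outside the scope of this sketch.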

