Customer churn prediction for telecommunication: Employing various features selection techniques and tree based ensemble classifiers

Adnan Idris, Asifullah Khan
Published in: 2012 15th International Multitopic Conference (INMIC)
Publication date: 2012-12-01
DOI: 10.1109/INMIC.2012.6511498 (https://doi.org/10.1109/INMIC.2012.6511498)
Citations: 18

Abstract

Ensemble classifiers have received increasing attention in recent years for their higher classification performance. In this paper, we compare the performance of various tree-based ensemble classifiers combined with maximum relevancy and minimum redundancy (mRMR), Fisher's ratio, and F-score based feature selection schemes on the challenging problem of churn prediction in telecommunication. The large size of telecommunication datasets has been the main hurdle to achieving the desired classification performance in recently proposed churn prediction models. Although tree-based ensemble classifiers are considered suitable for large datasets, we found rotation forest and RotBoost to be effective compared to random forest; they employ boosting through feature selection and increase diversity by incorporating a linear feature extraction method such as Principal Component Analysis. In addition to the feature selection performed by the ensembles themselves, we also applied mRMR, Fisher's ratio, and F-score for feature selection. Compared to Fisher's ratio and F-score, mRMR returns a coherent and well-discriminating feature set, which significantly reduces computation and helps the classifier attain improved performance. Performance is evaluated using area under the curve (AUC), sensitivity, and specificity. RotBoost, an ensemble of rotation forest and AdaBoost, combined with mRMR, shows competitive results for churn prediction in telecommunication compared to other ensemble methods.
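As a rough illustration of two of the filter criteria and two of the evaluation measures named in the abstract, the sketch below ranks features by Fisher's ratio and F-score and computes sensitivity and specificity. The abstract does not give the paper's exact formulas, so the definitions used here are the commonly cited ones and should be read as assumptions. The synthetic data, the function names, and the small constant added to the denominators are all hypothetical and are not taken from the paper.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-feature Fisher's ratio for binary labels y in {0, 1}.

    Assumed definition: (mean_pos - mean_neg)^2 / (var_pos + var_neg).
    """
    pos, neg = X[y == 1], X[y == 0]
    num = (pos.mean(axis=0) - neg.mean(axis=0)) ** 2
    den = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return num / (den + 1e-12)  # small constant guards against zero variance

def f_score(X, y):
    """Per-feature F-score for binary labels y in {0, 1}.

    Assumed definition: squared deviations of each class mean from the
    overall mean, divided by the sum of the within-class sample variances.
    """
    pos, neg = X[y == 1], X[y == 0]
    overall = X.mean(axis=0)
    num = (pos.mean(axis=0) - overall) ** 2 + (neg.mean(axis=0) - overall) ** 2
    den = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return num / (den + 1e-12)

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity), treating label 1 (churner) as positive."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical synthetic data: feature 0 is shifted for churners, features 1-2 are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 3))
X[:, 0] += 2.0 * y

fisher_ranking = np.argsort(fisher_ratio(X, y))[::-1]   # best feature first
fscore_ranking = np.argsort(f_score(X, y))[::-1]
print("Fisher's ratio ranking:", fisher_ranking)
print("F-score ranking:", fscore_ranking)

# Toy predictions, only to exercise the metric function.
y_true = np.array([1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 0, 1, 1])
sens, spec = sensitivity_specificity(y_true, y_pred)
print("sensitivity:", sens, "specificity:", spec)
```

Both criteria score each feature independently, so neither accounts for redundancy between features; that is the gap mRMR is designed to address. The metric function assumes both classes are present in `y_true` and does not handle an empty class.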