Jia Yi Vivian Quek, Ying Han Pang, Zheng You Lim, Shih Yin Ooi, Wee How Khoh
{"title":"基于属性选择分析和支持向量机的客户流失预测","authors":"Jia Yi Vivian Quek, Ying Han Pang, Zheng You Lim, Shih Yin Ooi, Wee How Khoh","doi":"10.18080/jtde.v11n3.777","DOIUrl":null,"url":null,"abstract":"An accurate customer churn prediction could alert businesses about potential churn customers so that proactive actions can be taken to retain the customers. Predicting churn may not be easy, especially with the increasing database sample size. Hence, attribute selection is vital in machine learning to comprehend complex attributes and identify essential variables. In this paper, a customer churn prediction model is proposed based on attribute selection analysis and Support Vector Machine. The proposed model improves churn prediction performance with reduced feature dimensions by identifying the most significant attributes of customer data. Firstly, exploratory data analysis and data preprocessing are performed to understand the data and preprocess it to improve the data quality. Next, two filter-based attribute selection techniques, i.e., Chi-squared and Analysis of Variance (ANOVA), are applied to the pre-processed data to select relevant features. Then, the selected features are input into a Support Vector Machine for classification. A real-world telecom database is used for model assessment. The empirical results demonstrate that ANOVA outperforms the Chi-squared filter in attribute selection. Furthermore, the results also show that, with merely ~50% of the features, feature selection based on ANOVA exhibits better performance compared to full feature set utilization.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Customer Churn Prediction through Attribute Selection Analysis and Support Vector Machine\",\"authors\":\"Jia Yi Vivian Quek, Ying Han Pang, Zheng You Lim, Shih Yin Ooi, Wee How Khoh\",\"doi\":\"10.18080/jtde.v11n3.777\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"An accurate customer churn prediction could alert businesses about potential churn customers so that proactive actions can be taken to retain the customers. Predicting churn may not be easy, especially with the increasing database sample size. Hence, attribute selection is vital in machine learning to comprehend complex attributes and identify essential variables. In this paper, a customer churn prediction model is proposed based on attribute selection analysis and Support Vector Machine. The proposed model improves churn prediction performance with reduced feature dimensions by identifying the most significant attributes of customer data. Firstly, exploratory data analysis and data preprocessing are performed to understand the data and preprocess it to improve the data quality. Next, two filter-based attribute selection techniques, i.e., Chi-squared and Analysis of Variance (ANOVA), are applied to the pre-processed data to select relevant features. Then, the selected features are input into a Support Vector Machine for classification. A real-world telecom database is used for model assessment. The empirical results demonstrate that ANOVA outperforms the Chi-squared filter in attribute selection. Furthermore, the results also show that, with merely ~50% of the features, feature selection based on ANOVA exhibits better performance compared to full feature set utilization.\",\"PeriodicalId\":37752,\"journal\":{\"name\":\"Australian Journal of Telecommunications and the Digital Economy\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-09-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Australian Journal of Telecommunications and the Digital Economy\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18080/jtde.v11n3.777\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Social Sciences\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Australian Journal of Telecommunications and the Digital Economy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18080/jtde.v11n3.777","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Social Sciences","Score":null,"Total":0}
Customer Churn Prediction through Attribute Selection Analysis and Support Vector Machine
An accurate customer churn prediction could alert businesses about potential churn customers so that proactive actions can be taken to retain the customers. Predicting churn may not be easy, especially with the increasing database sample size. Hence, attribute selection is vital in machine learning to comprehend complex attributes and identify essential variables. In this paper, a customer churn prediction model is proposed based on attribute selection analysis and Support Vector Machine. The proposed model improves churn prediction performance with reduced feature dimensions by identifying the most significant attributes of customer data. Firstly, exploratory data analysis and data preprocessing are performed to understand the data and preprocess it to improve the data quality. Next, two filter-based attribute selection techniques, i.e., Chi-squared and Analysis of Variance (ANOVA), are applied to the pre-processed data to select relevant features. Then, the selected features are input into a Support Vector Machine for classification. A real-world telecom database is used for model assessment. The empirical results demonstrate that ANOVA outperforms the Chi-squared filter in attribute selection. Furthermore, the results also show that, with merely ~50% of the features, feature selection based on ANOVA exhibits better performance compared to full feature set utilization.
期刊介绍:
The Journal of Telecommunications and the Digital Economy (JTDE) is an international, open-access, high quality, peer reviewed journal, indexed by Scopus and Google Scholar, covering innovative research and practice in Telecommunications, Digital Economy and Applications. The mission of JTDE is to further through publication the objective of advancing learning, knowledge and research worldwide. The JTDE publishes peer reviewed papers that may take the following form: *Research Paper - a paper making an original contribution to engineering knowledge. *Special Interest Paper – a report on significant aspects of a major or notable project. *Review Paper for specialists – an overview of a relevant area intended for specialists in the field covered. *Review Paper for non-specialists – an overview of a relevant area suitable for a reader with an electrical/electronics background. *Public Policy Discussion - a paper that identifies or discusses public policy and includes investigation of legislation, regulation and what is happening around the world including best practice *Tutorial Paper – a paper that explains an important subject or clarifies the approach to an area of design or investigation. *Technical Note – a technical note or letter to the Editors that is not sufficiently developed or extensive in scope to constitute a full paper. *Industry Case Study - a paper that provides details of industry practices utilising a case study to provide an understanding of what is occurring and how the outcomes have been achieved. *Discussion – a contribution to discuss a published paper to which the original author''s response will be sought. Historical - a paper covering a historical topic related to telecommunications or the digital economy.