Predicting Employee Attrition using Machine Learning

2018 International Conference on Innovations in Information Technology (IIT)
Pub Date: 2018-11-01
DOI: 10.1109/INNOVATIONS.2018.8605976

Sarah S. Alduayj, K. Rajpoot

Citations: 56

Abstract

The growing interest in machine learning among business leaders and decision makers demands that researchers explore its use within business organisations. One of the major issues facing business leaders is the loss of talented employees. This research studies employee attrition using machine learning models. Using a synthetic dataset created by IBM Watson, three main experiments were conducted to predict employee attrition. The first experiment involved training the following machine learning models on the original class-imbalanced dataset: support vector machine (SVM) with several kernel functions, random forest and K-nearest neighbour (KNN). The second experiment used the adaptive synthetic (ADASYN) sampling approach to overcome class imbalance, then retrained the same models on the resulting dataset. The third experiment used manual undersampling of the data to balance the classes. Training KNN (K = 3) on the ADASYN-balanced dataset achieved the highest performance, with an F1-score of 0.93. Finally, combining feature selection with random forest achieved an F1-score of 0.909 using 12 of the 29 features.
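The abstract does not include code, so the following is only a minimal sketch of three of the ingredients it names: undersampling the majority class, a K-nearest-neighbour vote with K = 3, and the F1-score. It uses the Python standard library only. The function names (`undersample`, `knn_predict`, `f1_score`) and the toy data are hypothetical and are not taken from the paper. ADASYN is deliberately left out because it generates synthetic minority samples and is not reproduced here. The undersampling shown is random, which may differ from the manual procedure the authors used.

```python
import random
from collections import Counter


def undersample(rows, labels, seed=0):
    """Randomly drop rows so every class keeps as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return [r for r, _ in balanced], [l for _, l in balanced]


def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k training rows closest to `query` (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy data: label 1 = employee left, label 0 = employee stayed.
    rng = random.Random(42)
    stayed = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(80)]
    left = [(rng.gauss(2.5, 1.0), rng.gauss(2.5, 1.0)) for _ in range(20)]
    rows, labels = stayed + left, [0] * 80 + [1] * 20

    train_rows, train_labels = undersample(rows, labels)
    print("class counts after undersampling:", Counter(train_labels))

    # For brevity this scores on the full toy set, which overlaps the training rows;
    # a real experiment would hold out a separate test split.
    predictions = [knn_predict(train_rows, train_labels, row, k=3) for row in rows]
    print("F1 on toy data:", round(f1_score(labels, predictions), 3))
```

The F1-score is reported for the positive (attrition) class because accuracy alone can look high on a class-imbalanced dataset even when few leavers are identified.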
