Predicting Job Change among Data Scientists using Machine Learning Technique
Felicisima V. Rafael
14th GCBSS Proceeding 2022, vol. 2 (1). Published 2022-12-28.
DOI: https://doi.org/10.35609/gcbssproceeding.2022.2(77)
Abstract
In the knowledge- and data-driven economy, data scientists are credited with transforming business and industry by using data science tools to recognize patterns in data and generate insights. This study aimed to apply data science to human resources, generating actionable intelligence and HR analytics to better understand employees' perceptions of the company and the work environment. The researcher followed the Knowledge Discovery in Databases (KDD) process. Knowledge discovery in databases is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns or relationships within a dataset in order to make important decisions; the dataset used here had 10,000 examples, 0 special attributes, and 14 regular attributes. RapidMiner was used to perform the KDD steps of selection, pre-processing, data transformation, and data mining with a machine learning algorithm. Decision Tree was found to be the learning algorithm that fit the ExampleSet. Among the 14 attributes, the most important attribute to split on was city_development_index, which implies that it was the best predictor variable for job change among data scientists. The prediction model has 92.1% confidence that a male who works in a city with a development index of 0.920, has relevant data science experience, is not presently enrolled in a university, is a high school graduate, has 5 years of work experience, presently works in a funded start-up with 50-99 employees, and is working for the first time, with training hours = 24, will "Not Change" jobs. The model has 77.78% accuracy and 81.70% precision.
Keywords: Data Scientist, Data Science, Job Change, Human Resource Analytics
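
The two quantitative ideas in the abstract, choosing the attribute to split on and reporting accuracy and precision, can be illustrated with a small Python sketch. It is not the RapidMiner process used in the study. The split criterion (information gain), the toy rows, the binned attribute name cdi_band, and the choice of "Not Change" as the positive class are all assumptions made for illustration; the abstract does not state which criterion or positive class was used.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by partitioning rows on a categorical attribute."""
    parent = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    children = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - children


def accuracy_and_precision(actual, predicted, positive):
    """Return (accuracy, precision) with respect to the given positive class."""
    pairs = list(zip(actual, predicted))
    correct = sum(a == p for a, p in pairs)
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    return correct / len(pairs), precision


# Hypothetical toy rows; "cdi_band" is an assumed binned form of
# city_development_index, not a column from the study's dataset.
rows = [
    {"cdi_band": "high", "gender": "Male", "target": "Not Change"},
    {"cdi_band": "high", "gender": "Female", "target": "Not Change"},
    {"cdi_band": "high", "gender": "Male", "target": "Not Change"},
    {"cdi_band": "low", "gender": "Female", "target": "Change"},
    {"cdi_band": "low", "gender": "Male", "target": "Change"},
    {"cdi_band": "low", "gender": "Female", "target": "Not Change"},
]

# The root split is the attribute with the largest information gain.
gains = {a: information_gain(rows, a, "target") for a in ("cdi_band", "gender")}
best_attribute = max(gains, key=gains.get)

# Hypothetical predictions, scored with "Not Change" as the assumed positive class.
actual = ["Not Change", "Not Change", "Not Change", "Change", "Change"]
predicted = ["Not Change", "Not Change", "Change", "Not Change", "Change"]
accuracy, precision = accuracy_and_precision(actual, predicted, "Not Change")

print(best_attribute, gains)
print(accuracy, precision)
```

In this toy data the city-development band separates the classes better than gender, so it is chosen as the root split, which mirrors the role the abstract reports for city_development_index. A real decision tree would instead search for a numeric threshold on the index rather than rely on a pre-binned band.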