Sofía Ramos-Pulido, Neil Hernández-Gress, Gabriela Torres-Delgado
Informatics, published 2024-01-24. DOI: https://doi.org/10.3390/informatics11010006
Exploring the Relationship between Career Satisfaction and University Learning Using Data Science Models
Existing research on graduates' career satisfaction offers educational institutions limited guidance in devising methods to attain high career satisfaction. This study therefore uses data science models to understand and predict career satisfaction based on information collected from surveys of university alumni. Five machine learning (ML) algorithms were used for data analysis: decision tree, random forest, gradient boosting, support vector machine, and neural network models. To achieve optimal prediction performance, we used Bayesian optimization to fine-tune the parameters of the five ML algorithms. The five ML models were compared with logistic and ordinal regression. Then, to extract the most important features of the best predictive model, we employed SHapley Additive exPlanations (SHAP), a novel methodology for extracting the significant features in ML. The results indicated that gradient boosting is a marginally superior predictive model, with 2–3% higher accuracy and area under the receiver operating characteristic curve (AUC) than logistic and ordinal regression. Interestingly, regarding low career satisfaction, alumni who gave the lowest scores to the survey item “how frequently applied knowledge, skills, or technological tools from the academic training” were less satisfied with their careers. To summarize, career satisfaction is related to academic training, alumni satisfaction, employment status, published articles or books, and other factors.
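To make the feature-attribution step more concrete, the following is a minimal sketch of the Shapley value computation that SHAP builds on. It is not the authors' code and does not use the SHAP library: the scoring function `toy_model`, its three features, the instance `x`, and the `baseline` are all hypothetical and chosen only for illustration. The sketch enumerates every feature coalition exactly, which is feasible only for a handful of features; practical tools approximate this.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Model output when only the coalition's features take x's values.
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size (excluding i).
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(f):
    """Hypothetical satisfaction score, not a model from the study."""
    applied_knowledge, employed, publications = f
    return 0.5 * applied_knowledge + 2.0 * employed + 0.3 * publications * employed


x = [4.0, 1.0, 2.0]         # hypothetical alumnus
baseline = [1.0, 0.0, 0.0]  # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)
print([round(v, 3) for v in phi])
```

The attributions sum to the difference between the model output for `x` and for `baseline`, and the interaction term between `employed` and `publications` is split equally between those two features. This additivity is what lets per-feature contributions be read as a decomposition of a single prediction.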