Balancing Performance and Explainability in Academic Dropout Prediction

IF 2.9 | Zone 3 (Education) | JCR Q2: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Andrea Zanellati;Stefano Pio Zingaro;Maurizio Gabbrielli
IEEE Transactions on Learning Technologies, vol. 17, pp. 2140-2153. Published 2024-07-26. DOI: 10.1109/TLT.2024.3425959
https://ieeexplore.ieee.org/document/10612222/
Citations: 0

Abstract

Academic dropout remains a significant challenge for education systems, necessitating rigorous analysis and targeted interventions. This study employs machine learning techniques, specifically random forest (RF) and feature tokenizer transformer (FTT), to predict academic attrition. Utilizing a comprehensive dataset of over 40 000 students from an Italian university, the research incorporates a range of variables, including demographic information, prior educational metrics, and real-time academic performance indicators. We present a nuanced comparative evaluation of the RF and FTT models, highlighting their predictive accuracy and interpretative capabilities. Our empirical results demonstrate the effectiveness of machine learning in managing student attrition, with FTT models outperforming RF models in terms of predictive accuracy and achieving a sensitivity rate of 81%. Significantly, the inclusion of historical academic data enhances the models' ability to identify students at increased risk of dropping out. Furthermore, we apply advanced explanatory techniques, such as Shapley additive explanations (SHAP), to investigate the discriminative power of these models across different student profiles. This provides valuable insights into the key variables influencing dropout risk, contributing to a more holistic understanding of the issue. In addition, we conduct a fairness analysis to ensure the ethical robustness of our predictive models, making them not only effective but also equitable tools.
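For readers unfamiliar with the headline metric, the following is a minimal sketch of how sensitivity (the share of actual dropouts that a model flags) can be computed, and how comparing it across student groups gives one simple fairness check. It is not the authors' code: the abstract does not say which fairness metric the paper uses, so the per-group sensitivity gap is an assumption, and the function names, the group attribute, and the toy labels are all hypothetical.

```python
from collections import defaultdict


def sensitivity(y_true, y_pred):
    """Share of actual dropouts (label 1) that are flagged: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive (dropout) cases in y_true")
    return tp / (tp + fn)


def sensitivity_by_group(y_true, y_pred, groups):
    """Sensitivity computed separately for each value of a group attribute.

    The group attribute is hypothetical; the abstract does not name the
    attributes examined in the paper's fairness analysis.
    """
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: sensitivity(ts, ps) for g, (ts, ps) in buckets.items()}


# Toy labels invented for illustration only (1 = dropped out, 0 = stayed).
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
groups = ["A", "A", "A", "B", "B", "B", "B", "A"]

overall = sensitivity(y_true, y_pred)
per_group = sensitivity_by_group(y_true, y_pred, groups)
# Assumed fairness indicator: largest difference in sensitivity between groups.
gap = max(per_group.values()) - min(per_group.values())
print(overall, per_group, gap)
```

A large gap would mean the model misses at-risk students in one group more often than in another, which is the kind of disparity a fairness analysis is meant to surface.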
Source journal: IEEE Transactions on Learning Technologies (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS)
CiteScore: 7.50
Self-citation rate: 5.40%
Articles published: 82
Review time: >12 weeks
Journal description: The IEEE Transactions on Learning Technologies covers all advances in learning technologies and their applications, including but not limited to the following topics: innovative online learning systems; intelligent tutors; educational games; simulation systems for education and training; collaborative learning tools; learning with mobile devices; wearable devices and interfaces for learning; personalized and adaptive learning systems; tools for formative and summative assessment; tools for learning analytics and educational data mining; ontologies for learning systems; standards and web services that support learning; authoring tools for learning materials; computer support for peer tutoring; learning via computer-mediated inquiry, field, and lab work; social learning techniques; social networks and infrastructures for learning and knowledge sharing; and creation and management of learning objects.