Comparative Analysis of Prediction Techniques to Determine Student Dropout: Logistic Regression vs Decision Trees

Alfredo Perez, Elizabeth E. Grandón, Mónica Caniupán, Gilda Vargas
Published in: 2018 37th International Conference of the Chilean Computer Science Society (SCCC)
Publication date: 2018-11-01
DOI: 10.1109/SCCC.2018.8705262 (https://doi.org/10.1109/SCCC.2018.8705262)
Citations: 5

Abstract

Detecting students who may drop out of an academic program is a relevant issue for universities, and there are ongoing efforts to examine the variables that determine student dropout. Dropout is defined in different ways; however, studies agree that a student drops out of a course of study only when several variables combine. This study compares the performance indicators of the current dropout model of the Universidad del Bío-Bío (UBB), which is based on logistic regression, with those of a new model based on decision trees. The new model was obtained through data mining methodologies and implemented with the SAP Predictive Analytics tool. Real data from the UBB databases were used to train, validate, and apply the model. The comparison shows that the proposed model predicts student dropout with an accuracy of 86%, a precision of 97%, and an error rate of 14%, better indicators than those delivered by the current model based on logistic regression. The resulting prediction model was subsequently optimized by considering other variables, which further improved the prediction indicators. Higher education institutions should take into account the variables that best explain student dropout in order to improve the retention of their students.
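The three indicators quoted in the abstract (accuracy, precision, error rate) are all derived from a binary confusion matrix, and the error rate is the complement of accuracy. The following is a minimal sketch of that arithmetic, not the authors' implementation. It assumes that "dropout" is the positive class, and the class name `ConfusionMatrix` and the counts used are hypothetical: the counts were invented so that the arithmetic lands on the reported percentages and are not the UBB data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier; the positive class is assumed to be 'dropout'."""

    tp: int  # predicted dropout, student did drop out
    fp: int  # predicted dropout, student stayed
    tn: int  # predicted stay, student stayed
    fn: int  # predicted stay, student did drop out

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn

    def accuracy(self) -> float:
        """Share of all students classified correctly."""
        return (self.tp + self.tn) / self.total

    def error_rate(self) -> float:
        """Share of all students classified incorrectly (1 - accuracy)."""
        return (self.fp + self.fn) / self.total

    def precision(self) -> float:
        """Share of predicted dropouts who really dropped out."""
        predicted_positive = self.tp + self.fp
        if predicted_positive == 0:
            raise ValueError("precision is undefined when nothing is predicted positive")
        return self.tp / predicted_positive


# Hypothetical counts for illustration only; NOT taken from the study.
example = ConfusionMatrix(tp=97, fp=3, tn=75, fn=25)

if __name__ == "__main__":
    print(f"accuracy:   {example.accuracy():.2f}")
    print(f"precision:  {example.precision():.2f}")
    print(f"error rate: {example.error_rate():.2f}")
```

Precision depends only on the students flagged as likely dropouts, so a model can reach high precision while still missing dropouts (the `fn` count); the abstract does not report recall, so the sketch leaves it out.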