A model for predicting dropout of higher education students
Authors: Anaíle Mendes Rabelo, Luis Enrique Zárate
Journal: Data Science and Management, vol. 8, no. 1, pp. 72-85
Published: 2024-07-23
DOI: 10.1016/j.dsm.2024.07.001
URL: https://www.sciencedirect.com/science/article/pii/S2666764924000341
Citations: 0
Abstract
Higher education institutions are increasingly concerned with student retention. This work is motivated by the interest in predicting and reducing student dropout, and consequently in reducing the financial losses these institutions incur. Based on a characterization of the dropout problem and the application of a knowledge discovery process, an ensemble model is proposed to improve dropout prediction. The ensemble combines the results of three models: Logistic Regression, Neural Networks, and Decision Tree. The resulting model correctly classifies 89% of students as enrolled or dropped out and correctly identifies 98.1% of dropouts. Compared with the Random Forest ensemble method, the proposed model shows desirable characteristics for helping management propose actions to retain students.
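
The abstract does not say how the three models' results are combined, so the following is only a minimal sketch under stated assumptions: it assumes a simple majority vote over per-student labels, and it reads the two reported figures as overall accuracy and recall on the dropout class. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-student labels from several models by majority vote.

    Assumption: the paper's combination rule is not given in the abstract;
    majority voting is used here purely for illustration.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]


def accuracy(y_true, y_pred):
    """Share of students whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive="dropout"):
    """Share of true dropouts that the predictions flag as dropouts."""
    predicted_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in predicted_for_positives) / len(predicted_for_positives)


# Toy labels for six students (invented, not data from the paper).
y_true = ["dropout", "enrolled", "dropout", "enrolled", "dropout", "enrolled"]
logreg = ["dropout", "enrolled", "enrolled", "enrolled", "dropout", "dropout"]
neural = ["dropout", "enrolled", "dropout", "dropout", "dropout", "enrolled"]
tree = ["enrolled", "enrolled", "dropout", "enrolled", "dropout", "dropout"]

ensemble = majority_vote(logreg, neural, tree)
print("ensemble:", ensemble)
print("accuracy:", accuracy(y_true, ensemble))
print("dropout recall:", recall(y_true, ensemble))
```

With three voters and two labels a tie cannot occur, which is one reason an odd number of base models is convenient for this kind of combination. In the toy data each base model makes two mistakes, yet the vote makes only one, and it misses no true dropout.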