Bilal I Al-Ahmad, Abdullah Alzaqebah, Rami Alkhawaldeh, Ala' M Al-Zoubi, Hsuehi Lo, Adel Ali
PeerJ Computer Science, vol. 11, e3087. Journal Article, published 2025-09-08. DOI: 10.7717/peerj-cs.3087 (https://doi.org/10.7717/peerj-cs.3087). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12453804/pdf/. Impact factor: 2.5. JCR: Q2 (Computer Science, Artificial Intelligence).
Predicting academic performance for students' university: case study from Saint Cloud State University.
Predicting student performance is a core task in educational data mining, used to monitor learning outcomes. Predicting grade point average (GPA) helps track academic performance and helps advisors identify students at risk of failing, changing majors, or dropping out. To improve prediction performance, this study employs a long short-term memory (LSTM) model trained on a rich set of academic and demographic features. The dataset, drawn from 29,455 students at Saint Cloud State University (SCSU) over eight years (2016-2024), was preprocessed by removing irrelevant and missing data, encoding categorical variables, and normalizing numerical features. Feature importance was estimated with a permutation-based method to identify the variables with the greatest impact on term GPA prediction. Model hyperparameters, including the number of LSTM layers, units per layer, batch size, learning rate, and activation functions, were tuned through experimental validation using the Adam optimizer and learning rate scheduling. Two experiments were conducted, at the college and department levels. The proposed model outperformed traditional machine learning models, namely linear regression (LR), K-nearest neighbor (KNN), decision tree (DT), random forest (RF), and support vector regressor (SVR), and it also surpassed two deep learning models, a recurrent neural network (RNN) and a convolutional neural network (CNN), achieving a mean absolute percentage error (MAPE) of 9.54, a mean absolute error (MAE) of 0.0059, a root mean square error (RMSE) of 0.0001, and an R² score of 99%.
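The abstract names four evaluation metrics and a permutation-based feature importance method without defining them. The sketch below illustrates the standard definitions in plain Python. It is not the authors' code: `toy_predict` is a hypothetical stand-in for the trained LSTM, the three feature columns (prior GPA, credits, a flag) and all values are invented for illustration, MAPE is assumed to be expressed in percent, and measuring importance as the increase in MAE is also an assumption.

```python
import math
import random


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Zero targets are skipped (assumption), since a GPA of 0.0 would divide by zero.
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, rows, y_true, n_repeats=10, seed=0):
    """Mean increase in MAE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mae(y_true, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only column j replaced by a shuffled value.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mae(y_true, [predict(r) for r in shuffled]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical model: uses prior GPA and credits, ignores the third column."""
    return 0.9 * row[0] + 0.02 * row[1]


# Invented rows: [prior_gpa, credits, flag]; invented term GPA targets.
rows = [[3.2, 15, 1], [2.1, 12, 0], [3.8, 16, 1],
        [2.9, 9, 0], [1.7, 14, 1], [3.5, 18, 0]]
y_true = [3.1, 2.2, 3.9, 2.7, 1.9, 3.6]
y_pred = [toy_predict(r) for r in rows]

importances = permutation_importance(toy_predict, rows, y_true)
```

A feature the model never reads, such as the third column here, gets an importance of zero because shuffling it leaves every prediction unchanged. On any single set of predictions RMSE is at least as large as MAE, which can serve as a quick sanity check when reading reported values.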
About the journal:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.