Helia Farhood, I. Joudah, Amin Beheshti, Samuel Muller
Informatics, published 2024-07-15. DOI: 10.3390/informatics11030046 (https://doi.org/10.3390/informatics11030046)
Citations: 0
Evaluating and Enhancing Artificial Intelligence Models for Predicting Student Learning Outcomes
Predicting student outcomes is an essential task and a central challenge for artificial intelligence-based personalised learning applications. Although several studies have explored student performance prediction, comprehensive research that methodically compares multiple machine learning models alongside deep learning architectures is notably lacking. In response, we evaluate and improve ten machine learning and deep learning models, spanning well-established and cutting-edge techniques: random forest, decision tree, support vector machine, K-nearest neighbours classifier, logistic regression, linear regression, and extreme gradient boosting (XGBoost), as well as a fully connected feed-forward neural network, a convolutional neural network, and a gradient-boosted neural network. We implemented and fine-tuned these models using Python 3.9.5. With an emphasis on prediction accuracy and model performance optimisation, we evaluate the models on two public benchmark student datasets. We employ a dual evaluation approach, using both k-fold cross-validation and holdout methods, to assess the models' performance. Our research focuses primarily on predicting whether students succeed or fail in their final examinations. Moreover, we explore feature selection with Lasso as a means of dimensionality reduction to improve model efficiency and prevent overfitting, and we examine its impact on prediction accuracy by evaluating each model both with and without Lasso. This study provides guidance for selecting and deploying predictive models for tabular data classification tasks such as student outcome prediction, in support of data-driven personalised education.
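The dual evaluation and the with/without-Lasso comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses a synthetic dataset in place of the two student datasets, uses logistic regression as a stand-in for any of the ten models, and picks the Lasso penalty (alpha=0.01), the 80/20 holdout split, and k=5 arbitrarily. The helper names build_pipeline and evaluate are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in for a student dataset with a binary pass/fail label.
X, y = make_classification(
    n_samples=400, n_features=30, n_informative=6, n_redundant=4,
    n_clusters_per_class=1, random_state=0,
)


def build_pipeline(use_lasso):
    """Scale features, optionally keep only Lasso-selected ones, then classify."""
    steps = [StandardScaler()]
    if use_lasso:
        # Lasso shrinks some coefficients to exactly zero; SelectFromModel keeps
        # the features whose coefficients remain above its (near-zero) threshold.
        steps.append(SelectFromModel(Lasso(alpha=0.01)))
    steps.append(LogisticRegression(max_iter=1000))
    return make_pipeline(*steps)


def evaluate(use_lasso):
    """Return accuracy under a holdout split and under stratified k-fold CV."""
    # Holdout: a single stratified 80/20 train/test split.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0
    )
    holdout_acc = build_pipeline(use_lasso).fit(X_train, y_train).score(X_test, y_test)

    # k-fold: mean accuracy over 5 folds. Feature selection sits inside the
    # pipeline, so it is refit on each training fold and never sees test data.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    cv_scores = cross_val_score(build_pipeline(use_lasso), X, y, cv=cv, scoring="accuracy")
    return {"holdout": float(holdout_acc), "kfold": float(cv_scores.mean())}


results = {
    "without_lasso": evaluate(use_lasso=False),
    "with_lasso": evaluate(use_lasso=True),
}
for name, scores in results.items():
    print(f"{name}: holdout={scores['holdout']:.3f}, kfold={scores['kfold']:.3f}")
```

Placing the selector inside the pipeline is a design choice of this sketch: it keeps the selection step from leaking information from held-out data into training, which would otherwise inflate both accuracy estimates.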