Ricardo Ordoñez-Avila, Jaime Meza, Sebastian Ventura
DOI: 10.7717/peerj-cs.2855 (https://doi.org/10.7717/peerj-cs.2855)
Journal: PeerJ Computer Science, volume 11, e2855
Publication date: 2025-05-05
Publication type: Journal Article
Impact factor: 3.5 (JCR Q2, Computer Science, Artificial Intelligence)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12193002/pdf/
Citations: 0
Abstract
Mining autonomous student patterns score on LMS within online higher education.
Higher education institutions actively integrate information and communication technologies through learning management systems (LMS), which are crucial for online education. This study used data mining techniques to predict the autonomous scores of students in the online Law and Psychology programs at the Technical University of Manabi. The process comprised data integration and selection of more than 16,000 records; preprocessing; transformation with RobustScaler; predictive modelling, including recursive feature elimination with cross-validation (RFEcv) to select features and hyperparameter tuning to achieve the best fit; and, finally, evaluation of the models using root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²). The feature selection suggested by RFEcv contributed to the performance of the models. The variables analyzed focused on download rate, homework submission rate, test performance rate, median daily accesses, median days of access per month, viewing of comments on teacher-reviewed assignments, length of the final exam, and not requiring the supplemental exam. Hyperparameter tuning improved the performance of the models after RFEcv was applied. The models evaluated showed minimal differences in RMSE (range 0.5411 to 0.6025). The gradient boosting model performed best on the Law online program data (R² = 0.6693, MAE = 0.4041, RMSE = 0.5411) and on the Psychology online program data (R² = 0.6418, MAE = 0.4232, RMSE = 0.6025), while on the combination of both data sets the extreme gradient boosting (XGBoost) model performed best (R² = 0.6294, MAE = 0.4295, RMSE = 0.5985). Future research and implementations could incorporate autonomous score data through plugins and reports integrated into LMSs. This approach may provide indicators of interest for understanding and improving online learning from a personalized, real-time perspective.
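As a rough illustration of two steps named in the abstract, the sketch below shows what a robust (median and interquartile range) scaling and the three evaluation metrics compute. It is not the authors' code. The abstract names RobustScaler, and scikit-learn's scaler of that name centers on the median and scales by the interquartile range by default, but the pure-Python functions `robust_scale` and `regression_metrics` and all sample numbers here are hypothetical and chosen only for illustration.

```python
import math
import statistics


def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    # "inclusive" quartiles use linear interpolation between data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        iqr = 1.0  # constant feature: keep the centered values unscaled
    return [(v - median) / iqr for v in values]


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for paired observed and predicted scores."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical feature with one extreme value (e.g. daily accesses).
daily_accesses = [1, 2, 3, 4, 5, 6, 7, 8, 100]
scaled = robust_scale(daily_accesses)
# Median 5 and IQR 4, so 5 maps to 0.0 and the outlier 100 maps to 23.75.

# Hypothetical observed and predicted scores.
observed = [7.0, 8.0, 9.0, 6.0]
predicted = [7.5, 8.0, 8.0, 6.5]
metrics = regression_metrics(observed, predicted)
# MAE = 0.5, RMSE = sqrt(0.375) (about 0.612), R^2 = 0.7 (up to rounding)
```

Because the median and quartiles barely move when a single extreme value is added, the bulk of the scaled feature stays in a narrow band around zero while the outlier remains visibly large. RMSE is never smaller than MAE on the same predictions, which is consistent with every RMSE and MAE pair reported in the abstract.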
About the journal:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.