Student Performance Prediction Using XGBoost Method from A Macro Perspective
Kuan Yan
2021 2nd International Conference on Computing and Data Science (CDS), 2021
DOI: https://doi.org/10.1109/CDS52072.2021.00084
Student performance prediction has attracted growing attention in the educational data mining field in recent years. An accurate forecast of student performance can support several goals, such as reducing student dropout, allocating teaching resources sensibly, and improving teaching methods. In this paper, we employ an XGBoost-based method to forecast student performance. Instead of using individual students as samples, we use a novel educational dataset structured from a macro perspective, an approach that has rarely appeared in existing research. We apply data cleaning, feature selection, and feature creation to improve the model's generalizability and the accuracy of its predictions. The XGBoost model outperformed five other classic machine learning models (Random Forest, Lasso, Elastic Net, Support Vector Machine, and Decision Tree), improving the R² score by 6.3% to 12.1% on different sub-datasets. Furthermore, feature importance analysis yields some forward-looking and meaningful conclusions.
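The reported gain is expressed through the R² score, so it helps to see how that metric is computed. The abstract does not say whether the 6.3% to 12.1% figure is an absolute difference in R² or a gain relative to the baseline model. The sketch below is therefore only an illustration under stated assumptions: the helper names `r2_score` and `r2_improvement` are hypothetical, the numbers are made up and are not data from the paper, and both readings of "improvement" are computed side by side.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_true = fmean(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def r2_improvement(r2_new, r2_baseline):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    if r2_baseline == 0:
        raise ValueError("relative gain is undefined for a baseline R^2 of 0")
    absolute = (r2_new - r2_baseline) * 100.0
    relative = (r2_new - r2_baseline) / abs(r2_baseline) * 100.0
    return absolute, relative


# Made-up illustrative values (assumption): NOT taken from the paper.
y_true = [60.0, 70.0, 80.0, 90.0]
pred_baseline = [66.0, 68.0, 84.0, 86.0]   # stands in for a baseline model
pred_candidate = [62.0, 69.0, 81.0, 88.0]  # stands in for the stronger model

r2_base = r2_score(y_true, pred_baseline)
r2_cand = r2_score(y_true, pred_candidate)
abs_gain, rel_gain = r2_improvement(r2_cand, r2_base)

print(f"baseline R^2  = {r2_base:.3f}")
print(f"candidate R^2 = {r2_cand:.3f}")
print(f"absolute gain = {abs_gain:.1f} percentage points")
print(f"relative gain = {rel_gain:.1f} %")
```

For the same pair of models the two readings give different numbers, so a reported percentage improvement in R² is easiest to interpret when the baseline score and the type of gain are both stated.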