F. Pereira, Elaine H. T. Oliveira, David Fernandes, A. Cristea
2019 IEEE 19th International Conference on Advanced Learning Technologies (ICALT), published 2019-07-01. DOI: 10.1109/ICALT.2019.00066 (https://doi.org/10.1109/ICALT.2019.00066)
Early Performance Prediction for CS1 Course Students using a Combination of Machine Learning and an Evolutionary Algorithm
Many researchers have started extracting student behaviour by cleaning data collected from web environments and using it as features in machine learning (ML) models. Using log data collected from an online judge, we compiled a set of successful features correlated with the student grade and applied them to a database representing 486 CS1 students. We used this set of features in ML pipelines that were optimised by combining an automated approach based on an evolutionary algorithm with hyperparameter tuning by random search. As a result, we achieved an accuracy of 75.55%, using data from only the first two weeks to predict the students' final grades. We show how our pipeline outperforms state-of-the-art work on similar scenarios.
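The abstract does not name the tools or features behind the pipeline optimisation, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes synthetic data with three hypothetical features (`attempts`, `solved`, and a noise column), a plain k-nearest-neighbour classifier, an evolutionary loop that searches over feature subsets, and a random search that tunes `k` for each candidate subset. All names and numbers are illustrative assumptions.

```python
import random


def make_data(n, rng):
    """Synthetic stand-in for online-judge log features (NOT the paper's data)."""
    rows = []
    for _ in range(n):
        passed = rng.random() < 0.5
        # Two informative hypothetical features and one pure-noise feature.
        attempts = rng.gauss(6.0 if passed else 3.0, 1.5)
        solved = rng.gauss(4.0 if passed else 2.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        rows.append(((attempts, solved, noise), int(passed)))
    return rows


def knn_predict(train, x, k, mask):
    """Majority vote among the k nearest rows, using only the masked-in features."""
    def dist(a, b):
        return sum((a[i] - b[i]) ** 2 for i in range(len(mask)) if mask[i])
    nearest = sorted(train, key=lambda row: dist(row[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)


def accuracy(train, valid, k, mask):
    hits = sum(knn_predict(train, x, k, mask) == y for x, y in valid)
    return hits / len(valid)


def random_search(train, valid, mask, rng, n_iter=5):
    """Random search: sample k at random and keep the best validation accuracy."""
    best_k, best_acc = None, -1.0
    for _ in range(n_iter):
        k = rng.choice(range(1, 16, 2))  # odd k avoids tied votes
        acc = accuracy(train, valid, k, mask)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, best_acc


def evolve(train, valid, rng, generations=5, pop_size=6):
    """Tiny evolutionary loop over feature masks; fitness comes from random search."""
    n_features = len(train[0][0])

    def random_mask():
        mask = [rng.random() < 0.5 for _ in range(n_features)]
        if not any(mask):  # keep at least one feature
            mask[rng.randrange(n_features)] = True
        return tuple(mask)

    def mutate(mask):
        i = rng.randrange(n_features)
        child = list(mask)
        child[i] = not child[i]
        if not any(child):  # undo a flip that would drop every feature
            child[i] = True
        return tuple(child)

    population = [random_mask() for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        scored = []
        for mask in population:
            k, acc = random_search(train, valid, mask, rng)
            scored.append((acc, k, mask))
        scored.sort(key=lambda s: s[0], reverse=True)
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        # Keep the fitter half and refill the population with mutated copies.
        survivors = [m for _, _, m in scored[:pop_size // 2]]
        children = [mutate(rng.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return best


rng = random.Random(0)
data = make_data(200, rng)
train, valid = data[:150], data[150:]
best_acc, best_k, best_mask = evolve(train, valid, rng)
print(best_acc, best_k, best_mask)
```

Because the same validation split drives both the feature-subset selection and the choice of `k`, the reported accuracy in this sketch is optimistically biased; a separate held-out test set or nested cross-validation would be needed for an honest estimate.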