{"title":"Detecting Student at Risk of Failure: A Case Study of Conceptualizing Mining from Internet Access Log Files","authors":"Ruangsak Trakunphutthirak, Y. Cheung, V. Lee","doi":"10.1109/ICDMW.2018.00060","DOIUrl":null,"url":null,"abstract":"Predicting student academic performance can be done by using educational data mining. Machine learning techniques play an important role for predicting academic performance from the large-scale data like the internet access log files from a university. Current data sources are mainly manual collections of data or data from a single unit of study. This study highlights the use of a new data source by transforming a university log file to predict academic performance. The log file comprises student internet access activities and browsing categories. To detect overall student academic performance, we select the best prediction accuracy by enhancing two datasets and comparing different weights in the time and frequency domains. We found that the random forest technique provides the best way in these datasets to predict students at risk-of-failure. We also found that data from internet access activities reveals a better accuracy than data from browsing categories. 
The combination of two datasets reveals a better picture of students' internet utilization and thus indicates how students at risk-of-failure can be detected by their internet access activities and browsing behavior.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW.2018.00060","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4
Abstract
Educational data mining can be used to predict student academic performance. Machine learning techniques play an important role in predicting academic performance from large-scale data such as a university's internet access log files. Existing data sources are mainly manually collected data or data from a single unit of study. This study highlights a new data source: a university log file transformed for predicting academic performance. The log file comprises student internet access activities and browsing categories. To detect overall student academic performance, we seek the best prediction accuracy by enhancing two datasets and comparing different weights in the time and frequency domains. We found that, on these datasets, the random forest technique predicts students at risk of failure best. We also found that data from internet access activities yields better accuracy than data from browsing categories. Combining the two datasets gives a better picture of students' internet utilization and thus indicates how students at risk of failure can be detected from their internet access activities and browsing behavior.
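
The abstract does not spell out how the time-domain and frequency-domain weights are applied. As a minimal sketch of one possible reading, the snippet below blends a per-student time feature (minutes online) with a frequency feature (number of visits) under a single weight. The function names, the min-max scaling, the linear blend, and the sample numbers are all assumptions made for illustration, not the authors' method or data. The blended score would then be one candidate input to a classifier such as a random forest, which is not shown here.

```python
from typing import List


def min_max(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_usage(minutes: List[float], visits: List[float], w_time: float) -> List[float]:
    """Blend a time-domain and a frequency-domain feature per student.

    w_time is the weight on time spent; (1 - w_time) goes to visit counts.
    This linear blend is an illustrative assumption, not the paper's formula.
    """
    if not 0.0 <= w_time <= 1.0:
        raise ValueError("w_time must be in [0, 1]")
    t = min_max(minutes)
    f = min_max(visits)
    return [w_time * a + (1.0 - w_time) * b for a, b in zip(t, f)]


# Hypothetical per-student totals, not data from the study.
minutes = [120.0, 30.0, 300.0]
visits = [40.0, 10.0, 25.0]

# Compare a few weightings, as one would when searching for the best setting.
for w in (0.0, 0.5, 1.0):
    print(w, [round(s, 3) for s in weighted_usage(minutes, visits, w)])
```

In practice each candidate weight would be scored by the downstream model's prediction accuracy, and the best-performing weight kept.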