Effective Feature Learning with Unsupervised Learning for Improving the Predictive Models in Massive Open Online Courses

Mucong Ding, Kai Yang, D. Yeung, T. Pong
Proceedings of the 9th International Conference on Learning Analytics & Knowledge
Published: 2018-12-12
DOI: 10.1145/3303772.3303795 (https://doi.org/10.1145/3303772.3303795)
Citations: 14

Abstract

The effectiveness of learning in massive open online courses (MOOCs) can be significantly enhanced by personalized intervention schemes, which rely on predictive models of student learning behaviors such as engagement or performance indicators. A major challenge in building such models is designing handcrafted features that are effective for the prediction task at hand. In this paper, we make the first attempt to solve this feature learning problem with an unsupervised learning approach that learns a compact representation of raw features containing a large degree of redundancy. Specifically, to capture both the underlying learning patterns in the content domain and the temporal nature of the clickstream data, we train a modified auto-encoder (AE) combined with a long short-term memory (LSTM) network to obtain a fixed-length embedding for each input sequence. Compared with the original features, the new features given by the embedding from the modified LSTM-AE are not only more parsimonious but also more discriminative for our prediction task. With simple supervised learning models, the learned features improve prediction accuracy by up to 17% compared with supervised neural networks and reduce overfitting to the dominant low-performing group of students, specifically in the task of predicting students' performance. Our approach is generic in that it is restricted neither to a specific supervised learning model nor to a specific prediction task in MOOC learning analytics.
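To make the idea of a fixed-length embedding concrete, the following is a minimal sketch of the encoder half of an LSTM auto-encoder, written in plain Python. It is not the authors' modified LSTM-AE: the abstract does not specify the modifications, the decoder, or the training procedure, so none of them are shown. The weights are random and untrained, and the function names (`init_lstm`, `affine`, `encode`) and dimensions are assumptions chosen for illustration. The sketch only shows that sequences of different lengths map to vectors of the same size.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(input_dim, hidden_dim, seed=0):
    """Random (untrained) weights for one LSTM layer with gates i, f, o, g."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        gate: {
            "W": mat(hidden_dim, input_dim),   # input-to-hidden weights
            "U": mat(hidden_dim, hidden_dim),  # hidden-to-hidden weights
            "b": [0.0] * hidden_dim,           # bias
        }
        for gate in ("i", "f", "o", "g")
    }


def affine(p, x, h):
    """Compute W x + U h + b for one gate."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + b
        for w_row, u_row, b in zip(p["W"], p["U"], p["b"])
    ]


def encode(params, sequence, hidden_dim):
    """Run the LSTM over a sequence; the final hidden state is the embedding."""
    h = [0.0] * hidden_dim  # hidden state
    c = [0.0] * hidden_dim  # cell state
    for x in sequence:
        i = [sigmoid(v) for v in affine(params["i"], x, h)]    # input gate
        f = [sigmoid(v) for v in affine(params["f"], x, h)]    # forget gate
        o = [sigmoid(v) for v in affine(params["o"], x, h)]    # output gate
        g = [math.tanh(v) for v in affine(params["g"], x, h)]  # candidate cell
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


# Hypothetical sizes: 6 raw features per time step, 4-dimensional embedding.
INPUT_DIM, HIDDEN_DIM = 6, 4
params = init_lstm(INPUT_DIM, HIDDEN_DIM)

# Two made-up clickstream-like sequences of different lengths.
rng = random.Random(1)
short_seq = [[rng.random() for _ in range(INPUT_DIM)] for _ in range(3)]
long_seq = [[rng.random() for _ in range(INPUT_DIM)] for _ in range(10)]

emb_short = encode(params, short_seq, HIDDEN_DIM)
emb_long = encode(params, long_seq, HIDDEN_DIM)
print(len(emb_short), len(emb_long))
```

Both embeddings have length `HIDDEN_DIM` regardless of sequence length. In a full auto-encoder, a decoder would be trained to reconstruct the input sequence from this vector, and the trained embedding could then be passed as features to a simple supervised model.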