Cody A. Coleman, Daniel T. Seaton, Isaac L. Chuang
Proceedings of the Second (2015) ACM Conference on Learning @ Scale, published 2015-03-14. DOI: https://doi.org/10.1145/2724660.2724662
Probabilistic Use Cases: Discovering Behavioral Patterns for Predicting Certification
Advances in open-online education have led to a dramatic increase in the size, diversity, and traceability of learner populations, offering tremendous opportunities to study detailed learning behavior of users around the world. This paper adapts the topic modeling approach of Latent Dirichlet Allocation (LDA) to uncover behavioral structure from student logs in an MITx Massive Open Online Course, 8.02x: Electricity and Magnetism. LDA is typically used in natural language processing, where it identifies the latent topic structure within a collection of documents. However, this framework can be adapted to the analysis of user-behavioral patterns by treating user interactions with courseware as a "bag of interactions", equivalent to the "bag of words" model found in topic modeling. With this representation, LDA forms probabilistic use cases that cluster students based on their behavior. Through the probability distributions associated with each use case, this approach provides an interpretable representation of user access patterns, while reducing the dimensionality of the data and improving accuracy. Using only the first week of logs, we can predict whether or not a student will earn a certificate with 0.81 ± 0.01 cross-validation accuracy. Thus, the method presented in this paper is a powerful tool for understanding user behavior and predicting outcomes.
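The "bag of interactions" idea can be illustrated with a minimal sketch. It is not the authors' implementation: the log entries, resource names, hyperparameters, and function names below are hypothetical, and the inference method shown (collapsed Gibbs sampling) is one common way to fit LDA, chosen here only because it fits in a few lines of standard-library Python. Each student plays the role of a document, each courseware resource the role of a word, and each latent topic becomes a use case.

```python
import random
from collections import Counter

# Hypothetical click-stream log: (user_id, courseware_resource) pairs.
LOGS = [
    ("u1", "video_1"), ("u1", "video_1"), ("u1", "video_2"),
    ("u2", "problem_1"), ("u2", "problem_2"), ("u2", "problem_1"),
    ("u3", "video_1"), ("u3", "problem_1"), ("u3", "video_2"),
]


def bag_of_interactions(logs):
    """Count interactions per user, ignoring their order (the 'bag' assumption)."""
    bags = {}
    for user, resource in logs:
        bags.setdefault(user, Counter())[resource] += 1
    return bags


def fit_lda(bags, n_use_cases=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative settings, not the paper's).

    Returns (theta, phi): theta maps each user to a mixture over use cases,
    phi holds one distribution over resources per use case.
    """
    rng = random.Random(seed)
    users = sorted(bags)
    vocab = sorted({r for bag in bags.values() for r in bag})
    r_index = {r: i for i, r in enumerate(vocab)}
    K, V = n_use_cases, len(vocab)

    # Expand each user's counts into a flat list of resource indices.
    docs = [[r_index[r] for r in sorted(bags[u]) for _ in range(bags[u][r])]
            for u in users]

    n_uk = [[0] * K for _ in users]      # user x use-case counts
    n_kr = [[0] * V for _ in range(K)]   # use-case x resource counts
    n_k = [0] * K                        # total interactions per use case

    # Random initial assignment of every interaction to a use case.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(K)
            z_d.append(k)
            n_uk[d][k] += 1
            n_kr[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                n_uk[d][k] -= 1
                n_kr[k][w] -= 1
                n_k[k] -= 1
                # Resample from the conditional given all other assignments.
                weights = [
                    (n_uk[d][j] + alpha) * (n_kr[j][w] + beta) / (n_k[j] + V * beta)
                    for j in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                n_uk[d][k] += 1
                n_kr[k][w] += 1
                n_k[k] += 1

    # Smoothed point estimates of the two families of distributions.
    theta = {
        u: [(n_uk[d][k] + alpha) / (len(docs[d]) + K * alpha) for k in range(K)]
        for d, u in enumerate(users)
    }
    phi = [
        {r: (n_kr[k][r_index[r]] + beta) / (n_k[k] + V * beta) for r in vocab}
        for k in range(K)
    ]
    return theta, phi


bags = bag_of_interactions(LOGS)
theta, phi = fit_lda(bags)
```

Here `theta[user]` is a short vector of use-case proportions for each student, which is the kind of reduced-dimension representation that could be fed to a downstream classifier for certification; the abstract does not state which classifier was used. Each entry of `phi` is a probability distribution over resources and is what makes a use case interpretable.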