{"title":"Mining knowledge components from many untagged questions","authors":"N. Zimmerman, R. Baker","doi":"10.1145/3027385.3029462","DOIUrl":null,"url":null,"abstract":"An ongoing study is being run to ensure that the McGraw-Hill Education LearnSmart platform teaches students as efficiently as possible. The first step in doing so is to identify what Knowledge Components (KCs) exist in the content; while the content is tagged by experts, these tags need to be re-calibrated periodically. LearnSmart courses are organized into chapters corresponding to those found in a textbook; each chapter can have anywhere from about a hundred to a few thousand questions. The KC extraction algorithms proposed by Barnes [1] and Desmarais et al [3] are applied on a chapter-by-chapter basis. To assess the ability of each mined q matrix to describe the observed learning, the PFA model of Pavlik et al [4] is fitted to it and a cross-validated AUC is calculated. The models are assessed based on whether PFA's predictions of student correctness are accurate. Early results show that both algorithms do a reasonable job of describing student progress, but q matrices with very different numbers of KCs fit observed data similarly well. Consequently, further consideration is required before automated extraction is practical in this context.","PeriodicalId":160897,"journal":{"name":"Proceedings of the Seventh International Learning Analytics & Knowledge Conference","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Seventh International Learning Analytics & Knowledge Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3027385.3029462","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
An ongoing study is being run to ensure that the McGraw-Hill Education LearnSmart platform teaches students as efficiently as possible. The first step in doing so is to identify what Knowledge Components (KCs) exist in the content; while the content is tagged by experts, these tags need to be re-calibrated periodically. LearnSmart courses are organized into chapters corresponding to those found in a textbook; each chapter can have anywhere from about a hundred to a few thousand questions. The KC extraction algorithms proposed by Barnes [1] and Desmarais et al [3] are applied on a chapter-by-chapter basis. To assess the ability of each mined q matrix to describe the observed learning, the PFA model of Pavlik et al [4] is fitted to it and a cross-validated AUC is calculated. The models are assessed based on whether PFA's predictions of student correctness are accurate. Early results show that both algorithms do a reasonable job of describing student progress, but q matrices with very different numbers of KCs fit observed data similarly well. Consequently, further consideration is required before automated extraction is practical in this context.