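Do Not Trust a Model Because It is Confident: Uncovering and Characterizing Unknown Unknowns to Student Success Predictors in Online-Based Learning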

LAK23: 13th International Learning Analytics and Knowledge Conference. Pub Date: 2022-12-16. DOI: 10.1145/3576050.3576148

Roberta Galici, Tanja Kaser, G. Fenu, M. Marras


Citations: 2

Abstract


Student success models might be prone to develop weak spots, i.e., examples hard to accurately classify due to insufficient representation during model creation. This weakness is one of the main factors undermining users’ trust, since model predictions could for instance lead an instructor to not intervene on a student in need. In this paper, we unveil the need of detecting and characterizing unknown unknowns in student success prediction in order to better understand when models may fail. Unknown unknowns include the students for which the model is highly confident in its predictions, but is actually wrong. Therefore, we cannot solely rely on the model’s confidence when evaluating the predictions quality. We first introduce a framework for the identification and characterization of unknown unknowns. We then assess its informativeness on log data collected from flipped courses and online courses using quantitative analyses and interviews with instructors. Our results show that unknown unknowns are a critical issue in this domain and that our framework can be applied to support their detection. The source code is available at https://github.com/epfl-ml4ed/unknown-unknowns.
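The definition above (a prediction that is highly confident yet wrong) can be illustrated with a minimal sketch. This is not the authors' framework, which lives in the linked repository; the function name `find_unknown_unknowns`, the 0.9 confidence threshold, the 0.5 decision boundary, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence


def find_unknown_unknowns(
    y_true: Sequence[int],
    p_pass: Sequence[float],
    threshold: float = 0.9,  # assumed cut-off for "highly confident"
) -> List[int]:
    """Return indices of students whose prediction is wrong
    although the model's confidence is at least `threshold`."""
    flagged = []
    for i, (label, p) in enumerate(zip(y_true, p_pass)):
        predicted = 1 if p >= 0.5 else 0   # assumed decision boundary
        confidence = max(p, 1.0 - p)       # confidence in the predicted class
        if predicted != label and confidence >= threshold:
            flagged.append(i)
    return flagged


# Toy data (hypothetical): 1 = student passes, 0 = student fails.
y_true = [1, 0, 1, 0, 1]
p_pass = [0.95, 0.97, 0.04, 0.40, 0.55]  # model's predicted pass probability

print(find_unknown_unknowns(y_true, p_pass))  # [1, 2]
```

Students 1 and 2 are flagged because the model is at least 96% confident and still wrong. Filtering on confidence alone would have kept both predictions, which is the failure mode the abstract warns about.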
