Predicting Item Response Theory Parameters Using Question Statements Texts

Wemerson Marinho, E. W. Clua, Luis Martí, Karla Marinho
LAK23: 13th International Learning Analytics and Knowledge Conference. Published 2023-03-13. DOI: 10.1145/3576050.3576139 (https://doi.org/10.1145/3576050.3576139)

Abstract

Neural language models pre-trained on massive text corpora have recently become available. These models encode statistical features of a language in their parameters, producing better word vector representations that allow neural networks to be trained on smaller sample sets. In this context, we investigate the application of these models to predicting Item Response Theory (IRT) parameters of multiple-choice questions. More specifically, we apply our models to questions from the Brazilian National High School Exam (ENEM), using the text of their statements, and propose a novel optimization target for regression: the Item Characteristic Curve. The architecture employed could predict the difficulty parameter b of the ENEM 2020 and 2021 items with a mean absolute error of 70 points. Calculating the IRT score in each knowledge area of the exam for a sample of 100,000 students, we obtained a mean absolute error below 40 points for all knowledge areas. Considering only the top quartile, the exam's main target of interest, the average error was less than 30 points for all areas and lower than 15 points for most of them. Such performance allows predicting parameters for newly created questions, composing mock tests for student training, and analyzing students' performance with excellent precision, dispensing with the costly pre-test step of item calibration.
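
As an illustration of what regressing on an Item Characteristic Curve rather than on raw parameters can mean, the following minimal sketch compares two curves over a grid of abilities. It assumes the three-parameter logistic (3PL) form of the curve, which the abstract does not specify; the function names, the ability grid, and the parameter values are hypothetical, and this is not the authors' implementation.

```python
import math


def icc(theta, a, b, c):
    """Assumed 3PL curve: probability of a correct answer at ability theta.

    a = discrimination, b = difficulty, c = pseudo-guessing.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def icc_distance(true_params, pred_params, thetas):
    """Mean absolute gap between two curves over a grid of abilities."""
    gaps = [abs(icc(t, *true_params) - icc(t, *pred_params)) for t in thetas]
    return sum(gaps) / len(gaps)


# Hypothetical ability grid on a standardized scale, from -4.0 to 4.0.
thetas = [i / 10.0 for i in range(-40, 41)]

# Illustrative (a, b, c) triples, not taken from ENEM data.
true_item = (1.2, 0.5, 0.2)
pred_item = (1.0, 0.8, 0.2)

loss = icc_distance(true_item, pred_item, thetas)
```

A curve-based target of this kind penalizes a prediction by how far the whole response curve moves, so errors in a, b, and c are weighed by their joint effect on the predicted probabilities instead of being treated as three unrelated regression targets.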