Detecting Test-Taking Engagement in Changing Test Contexts


ETS Research Report Series Pub Date : 2024-10-03 DOI:10.1002/ets2.12384

Blair Lehman, Jesse R. Sparks, Jonathan Steinberg



Abstract

Over the last 20 years, many methods have been proposed that use process data (e.g., response time) to detect changes in engagement during the test-taking process. However, many of these methods were developed and evaluated in highly similar testing contexts: 30 or more single-select multiple-choice items presented in a linear, fixed sequence in which an item must be answered before progressing to the next item. This testing context is becoming less and less representative of testing contexts in general as the affordances of technology are leveraged to provide more diverse and innovative testing experiences. The testing context of the 2019 National Assessment of Educational Progress (NAEP) mathematics administration for grades 8 and 12 is an example use case that differed significantly from the assessments typically used in previous research on test-taking engagement (e.g., in number of items, item format, and navigation). Thus, we leveraged this use case to re-evaluate the utility of an existing engagement detection method: the normative threshold method. We decomposed the normative threshold method to evaluate its alignment with this use case and then evaluated 25 variations of this threshold-setting method against previously established evaluation criteria. Our findings revealed that this critical analysis of the threshold-setting method's alignment with the NAEP testing context could be used to identify the most appropriate variation of the method for this use case. We discuss the broader implications for engagement detection as testing contexts continue to evolve.
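As a rough illustration of what a normative threshold looks like in practice, the sketch below flags a response as rapid (possibly disengaged) when its response time falls below a fixed percentage of the item's mean response time. The abstract does not describe the 25 variations the authors evaluated, so everything here is an assumption made for illustration: the function names, the 10% default, and the 10-second cap are hypothetical parameters, not the report's settings.

```python
from statistics import mean


def normative_threshold(response_times, pct=0.10, cap_seconds=10.0):
    """Return a per-item time threshold in seconds.

    Assumption: the threshold is `pct` of the item's mean response time,
    limited to at most `cap_seconds`. Both values are illustrative.
    """
    if not response_times:
        raise ValueError("need at least one response time")
    return min(pct * mean(response_times), cap_seconds)


def flag_rapid_responses(response_times, pct=0.10, cap_seconds=10.0):
    """Mark each response True if it is faster than the item's threshold."""
    threshold = normative_threshold(response_times, pct, cap_seconds)
    return [t < threshold for t in response_times]


# Hypothetical response times (seconds) of five test takers on one item.
item_times = [2.0, 45.0, 60.0, 38.0, 55.0]
threshold = normative_threshold(item_times)
flags = flag_rapid_responses(item_times)
print(threshold, flags)
```

Varying `pct` and `cap_seconds` is one simple way to generate alternative versions of such a threshold; whether that matches how the report constructed its variations is not stated in the abstract.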

