Accuracy of progress monitoring decision rules to evaluate response to instruction with two computer adaptive tests

IF 3.8 | Zone 1, Psychology | Q1 PSYCHOLOGY, SOCIAL
Ethan R. Van Norman, Emily R. Forcht
Journal of School Psychology. Published 2024-05-14. DOI: 10.1016/j.jsp.2024.101319
https://www.sciencedirect.com/science/article/pii/S0022440524000396

Abstract

Computer adaptive tests have become popular assessments for screening students for academic risk, and research is emerging on their use as progress monitoring tools to measure response to instruction. We evaluated the accuracy of the trend-line decision rule when applied to outcomes from a frequently used reading computer adaptive test (i.e., Star Reading [SR]) and a frequently used math computer adaptive test (i.e., Star Math [SM]). We analyzed extant SR and SM data to inform the conditions for simulations that determined the number of assessments required to yield sufficient sensitivity (i.e., probability of recommending an instructional change when a change was warranted) and specificity (i.e., probability of recommending maintaining an intervention when a change was not warranted) when comparing performance to goal lines based on a future target score (i.e., benchmark) as well as normative comparisons (50th and 75th percentiles). The extant SR dataset consisted of monthly progress monitoring data from 993 Grade 3, 804 Grade 4, and 709 Grade 5 students from multiple states in the northwestern United States. SM data were also drawn from the Northwest and contained outcomes from 518 Grade 3, 474 Grade 4, and 391 Grade 5 students. Grade-level samples were predominantly White (range = 59.89%–67.72%), followed by Latinx (range = 9.65%–15.94%). Simulation results suggest that when data were collected once a month, seven, eight, and nine observations were required to support low-stakes decisions with SR for Grades 3, 4, and 5, respectively. For SM, nine, ten, and eight observations were required for Grades 3, 4, and 5, respectively. Given the length of time required to support reasonably accurate decisions, we provide recommendations to consider other types of assessments and decision-making frameworks for academic progress monitoring.
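
To make the definitions of sensitivity and specificity above concrete, here is a minimal Python sketch of how a trend-line decision rule could be checked by simulation. It is not the authors' procedure: the function names, the linear-growth-plus-Gaussian-noise model, and every numeric value (slopes, noise SD, number of replications) are hypothetical assumptions chosen only for illustration, not values from the study.

```python
import random
from statistics import mean


def ols_slope(times, scores):
    """Ordinary least squares slope of scores regressed on times."""
    t_bar, y_bar = mean(times), mean(scores)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, scores))
    return sxy / sxx


def recommend_change(times, scores, goal_slope):
    """Trend-line rule (as assumed here): recommend an instructional
    change when the fitted growth rate falls below the goal-line slope."""
    return ols_slope(times, scores) < goal_slope


def simulate_accuracy(low_slope, high_slope, goal_slope, n_obs,
                      noise_sd, reps=2000, seed=0):
    """Estimate sensitivity and specificity for n_obs monthly scores.

    Hypothetical model: each score is true_slope * month plus Gaussian
    measurement error. Requires n_obs >= 2 so a slope can be fitted.
    """
    rng = random.Random(seed)
    times = list(range(n_obs))

    def flag_rate(true_slope):
        # Share of simulated students for whom the rule recommends a change.
        flags = 0
        for _ in range(reps):
            scores = [true_slope * t + rng.gauss(0, noise_sd) for t in times]
            flags += recommend_change(times, scores, goal_slope)
        return flags / reps

    # Sensitivity: a change is warranted (true growth below goal) and is recommended.
    sensitivity = flag_rate(low_slope)
    # Specificity: no change is warranted and the rule recommends maintaining.
    specificity = 1 - flag_rate(high_slope)
    return sensitivity, specificity
```

Calling `simulate_accuracy` in a loop over `n_obs` (for example, 3 through 10) with a fixed `noise_sd` shows how the two rates move as more monthly observations accumulate, which is the kind of question the simulations in the abstract address.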

Source journal: Journal of School Psychology (PSYCHOLOGY, EDUCATIONAL)
CiteScore: 6.70
Self-citation rate: 8.00%
Articles published: 71
About the journal: The Journal of School Psychology publishes original empirical articles and critical reviews of the literature on research and practices relevant to psychological and behavioral processes in school settings. JSP presents research on intervention mechanisms and approaches; schooling effects on the development of social, cognitive, mental-health, and achievement-related outcomes; assessment; and consultation. Submissions from a variety of disciplines are encouraged. All manuscripts are read by the Editor and one or more editorial consultants with the intent of providing appropriate and constructive written reviews.