Comparative Analysis of Psychometric Frameworks and Properties of Scores from Autogenerated Test Forms
Won-Chan Lee, Stella Y. Kim
Educational Measurement-Issues and Practice, 43(4), 13-23. Published 2024-11-10. Impact factor: 2.7 (JCR Q1, Education & Educational Research).
DOI: 10.1111/emip.12648
https://onlinelibrary.wiley.com/doi/10.1111/emip.12648
Citation count: 0
Abstract
This paper explores the psychometric properties of scores derived from autogenerated test forms by introducing three conceptual frameworks: Alternate Test Forms, Randomly Parallel Forms, and Approximately Parallel Forms. Each framework provides a distinct perspective on score comparability, definitions of true score and standard error of measurement (SEM), and the necessity of equating. Through a simulation study, we illustrate how these frameworks compare in terms of true scores and SEMs, while also assessing the impact of equating on score comparability across varying levels of form variability. Ultimately, this study seeks to lay the groundwork for implementing scoring practices in large-scale standardized assessments that use autogenerated forms.