{"title":"适度项目校准的交叉分类随机效应模型","authors":"Seungwon Chung, Li Cai","doi":"10.3102/1076998620983908","DOIUrl":null,"url":null,"abstract":"In the research reported here, we propose a new method for scale alignment and test scoring in the context of supporting students with disabilities. In educational assessment, students from these special populations take modified tests because of a demonstrated disability that requires more assistance than standard testing accommodation. Updated federal education legislation and guidance require that these students be assessed and included in state education accountability systems, and their achievement reported with respect to the same rigorous content and achievement standards that the state adopted. Routine item calibration and linking methods are not feasible because the size of these special populations tends to be small. We develop a unified cross-classified random effects model that utilizes item response data from the general population as well as judge-provided data from subject matter experts in order to obtain revised item parameter estimates for use in scoring modified tests. We extend the Metropolis–Hastings Robbins–Monro algorithm to estimate the parameters of this model. The proposed method is applied to Braille test forms in a large operational multistate English language proficiency assessment program. Our work not only allows a broader range of modifications that is routinely considered in large-scale educational assessments but also directly incorporates the input from subject matter experts who work directly with the students needing support. Their structured and informed feedback deserves more attention from the psychometric community.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"46 1","pages":"651 - 681"},"PeriodicalIF":1.9000,"publicationDate":"2021-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Cross-Classified Random Effects Modeling for Moderated Item Calibration\",\"authors\":\"Seungwon Chung, Li Cai\",\"doi\":\"10.3102/1076998620983908\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the research reported here, we propose a new method for scale alignment and test scoring in the context of supporting students with disabilities. In educational assessment, students from these special populations take modified tests because of a demonstrated disability that requires more assistance than standard testing accommodation. Updated federal education legislation and guidance require that these students be assessed and included in state education accountability systems, and their achievement reported with respect to the same rigorous content and achievement standards that the state adopted. Routine item calibration and linking methods are not feasible because the size of these special populations tends to be small. We develop a unified cross-classified random effects model that utilizes item response data from the general population as well as judge-provided data from subject matter experts in order to obtain revised item parameter estimates for use in scoring modified tests. We extend the Metropolis–Hastings Robbins–Monro algorithm to estimate the parameters of this model. The proposed method is applied to Braille test forms in a large operational multistate English language proficiency assessment program. Our work not only allows a broader range of modifications that is routinely considered in large-scale educational assessments but also directly incorporates the input from subject matter experts who work directly with the students needing support. Their structured and informed feedback deserves more attention from the psychometric community.\",\"PeriodicalId\":48001,\"journal\":{\"name\":\"Journal of Educational and Behavioral Statistics\",\"volume\":\"46 1\",\"pages\":\"651 - 681\"},\"PeriodicalIF\":1.9000,\"publicationDate\":\"2021-01-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Educational and Behavioral Statistics\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://doi.org/10.3102/1076998620983908\",\"RegionNum\":3,\"RegionCategory\":\"心理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"EDUCATION & EDUCATIONAL RESEARCH\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Educational and Behavioral Statistics","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.3102/1076998620983908","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Cross-Classified Random Effects Modeling for Moderated Item Calibration
In the research reported here, we propose a new method for scale alignment and test scoring in the context of supporting students with disabilities. In educational assessment, students from these special populations take modified tests because of a demonstrated disability that requires more assistance than standard testing accommodation. Updated federal education legislation and guidance require that these students be assessed and included in state education accountability systems, and their achievement reported with respect to the same rigorous content and achievement standards that the state adopted. Routine item calibration and linking methods are not feasible because the size of these special populations tends to be small. We develop a unified cross-classified random effects model that utilizes item response data from the general population as well as judge-provided data from subject matter experts in order to obtain revised item parameter estimates for use in scoring modified tests. We extend the Metropolis–Hastings Robbins–Monro algorithm to estimate the parameters of this model. The proposed method is applied to Braille test forms in a large operational multistate English language proficiency assessment program. Our work not only allows a broader range of modifications that is routinely considered in large-scale educational assessments but also directly incorporates the input from subject matter experts who work directly with the students needing support. Their structured and informed feedback deserves more attention from the psychometric community.
期刊介绍:
Journal of Educational and Behavioral Statistics, sponsored jointly by the American Educational Research Association and the American Statistical Association, publishes articles that are original and provide methods that are useful to those studying problems and issues in educational or behavioral research. Typical papers introduce new methods of analysis. Critical reviews of current practice, tutorial presentations of less well known methods, and novel applications of already-known methods are also of interest. Papers discussing statistical techniques without specific educational or behavioral interest or focusing on substantive results without developing new statistical methods or models or making novel use of existing methods have lower priority. Simulation studies, either to demonstrate properties of an existing method or to compare several existing methods (without providing a new method), also have low priority. The Journal of Educational and Behavioral Statistics provides an outlet for papers that are original and provide methods that are useful to those studying problems and issues in educational or behavioral research. Typical papers introduce new methods of analysis, provide properties of these methods, and an example of use in education or behavioral research. Critical reviews of current practice, tutorial presentations of less well known methods, and novel applications of already-known methods are also sometimes accepted. Papers discussing statistical techniques without specific educational or behavioral interest or focusing on substantive results without developing new statistical methods or models or making novel use of existing methods have lower priority. Simulation studies, either to demonstrate properties of an existing method or to compare several existing methods (without providing a new method), also have low priority.