A shortened test is feasible: Evaluating a large-scale multistage adaptive English language assessment
Authors: Shangchao Min, Kyoungwon Bishop
Journal: Language Testing (journal article; impact factor 2.2)
Published: 2024-02-07
DOI: 10.1177/02655322231225426 (https://doi.org/10.1177/02655322231225426)
This paper evaluates the multistage adaptive test (MST) design of a large-scale academic language assessment (ACCESS) for Grades 1–12, with the aim of simplifying the current MST design, using both operational and simulated test data. Study 1 examined the operational population data (1,456,287 test-takers) from the listening and reading tests of MST ACCESS in the 2018–2019 school year to evaluate the MST design in terms of measurement efficiency and precision. Study 2 is a simulation study conducted to find an optimal MST design by manipulating the number of items per stage and the panel structure. The results from the operational test data showed that both the listening and reading tests could be shortened to six folders (i.e., 18 items) while yielding final ability estimates and reliability coefficients that differ only slightly from those of the current test. The simulation study showed that all six proposed MST designs yielded slightly better measurement accuracy and efficiency than the current design; among them, the 1-3-3 MST design with more items at earlier stages ranked first. The findings have implications for evaluating MST designs and for optimizing MST designs in language assessment.
About the journal:
Language Testing is a fully peer-reviewed international journal that publishes original research and review articles on language testing and assessment. It provides a forum for the exchange of ideas and information between people working in the fields of first and second language testing and assessment. This includes researchers and practitioners in EFL and ESL testing, and in assessment in child language acquisition and language pathology. In addition, special attention is given to issues of testing theory, experimental investigations, and the follow-up of practical implications.