A Comparison of the Relative Performance of Four IRT Models on Equating Passage-Based Tests
Kyung Yong Kim, Euijin Lim, Won‐Chan Lee
International Journal of Testing, published 2018-12-13
DOI: 10.1080/15305058.2018.1530239 (https://doi.org/10.1080/15305058.2018.1530239)
For passage-based tests, items that belong to a common passage often violate the local independence assumption of unidimensional item response theory (UIRT). Ignoring this local item dependence (LID) and estimating item parameters with a UIRT model could be problematic, because doing so might yield inaccurate parameter estimates, which in turn could affect the equating results. The main purpose of this article was to compare, under the random groups design, the relative performance of the three-parameter logistic (3PL), graded response (GR), bifactor, and testlet models in equating passage-based tests with various degrees of passage-induced LID. Simulation results showed that the testlet model produced the most accurate equating results, followed by the bifactor model. The 3PL model performed as well as the bifactor and testlet models when the degree of LID was low but returned less accurate equating results than the two multidimensional models as the degree of LID increased. Among the four models, the polytomous GR model provided the least accurate equating results.
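To make the notion of passage-induced LID concrete, here is a minimal Python sketch of how responses with a controllable degree of LID might be simulated. The abstract does not state the model equations or simulation settings, so everything below is an assumption: the sketch uses the commonly cited testlet parameterization, in which a person-by-passage effect `gamma` is subtracted from ability inside a 3PL response function, and the function names, item parameters, and `testlet_sd` values are hypothetical illustrations rather than the authors' design.

```python
import math
import random


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response:
    c + (1 - c) * logistic(a * (theta - b))."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def p_testlet(theta, gamma, a, b, c):
    """Testlet-style probability (assumed parameterization): the
    person-by-passage effect gamma shifts the effective ability for
    every item attached to that passage."""
    return p_3pl(theta - gamma, a, b, c)


def simulate(n_persons, items, testlet_sd, seed=0):
    """Simulate 0/1 responses.

    items: list of (passage_id, a, b, c) tuples (hypothetical values).
    testlet_sd: standard deviation of the passage effects; 0 gives
    locally independent 3PL data, larger values give stronger LID.
    """
    rng = random.Random(seed)
    passages = sorted({pid for pid, _, _, _ in items})
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)  # general ability
        # One effect per passage, shared by all items of that passage.
        gamma = {pid: rng.gauss(0.0, testlet_sd) for pid in passages}
        row = [
            int(rng.random() < p_testlet(theta, gamma[pid], a, b, c))
            for pid, a, b, c in items
        ]
        data.append(row)
    return data


# Hypothetical form: 2 passages with 3 items each.
ITEMS = [
    (0, 1.0, -0.5, 0.2), (0, 1.2, 0.0, 0.2), (0, 0.8, 0.5, 0.2),
    (1, 1.1, -1.0, 0.2), (1, 0.9, 0.3, 0.2), (1, 1.3, 1.0, 0.2),
]

no_lid = simulate(500, ITEMS, testlet_sd=0.0)
high_lid = simulate(500, ITEMS, testlet_sd=1.0)
```

Because each `gamma` is shared by all items of a passage, responses within a passage stay correlated even after conditioning on `theta`, which is the violation of local independence the abstract describes. Setting `testlet_sd` to 0 removes the dependence, so the data then follow the plain 3PL model.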