Combining Propensity Scores and Common Items for Test Score Equating.

IF 1.2 Zone 4 Psychology Q4 PSYCHOLOGY, MATHEMATICAL

Applied Psychological Measurement Pub Date : 2025-07-30 DOI:10.1177/01466216251363240

Inga Laukaityte, Gabriel Wallin, Marie Wiberg

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12310624/pdf/

Citations: 0

Abstract


Combining Propensity Scores and Common Items for Test Score Equating.

Ensuring that test scores are fair and comparable across different test forms and different test groups is a significant statistical challenge in educational testing. Methods to achieve score comparability, a process known as test score equating, often rely on including common test items or assuming that test-taker groups are similar in key characteristics. This study explores a novel approach that combines propensity scores, based on test takers' background covariates, with information from common items using kernel smoothing techniques for binary-scored test items. An empirical analysis using data from a high-stakes college admissions test evaluates the standard errors and differences in adjusted test scores. A simulation study examines the impact of factors such as the number of test takers, the number of common items, and the correlation between covariates and test scores on the method's performance. The findings demonstrate that integrating propensity scores with common item information reduces standard errors and bias more effectively than using either source alone. This suggests that balancing the groups on the test takers' covariates enhances the fairness and accuracy of test score comparisons across different groups. The proposed method highlights the benefits of considering all the collected data to improve score comparability.
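To make the kernel smoothing step more concrete, the following is a minimal sketch of generic Gaussian kernel equating, not the authors' implementation. It assumes the score probabilities for each form are already available; the abstract does not detail how the propensity scores and common items enter those probabilities, so they are supplied here as made-up numbers. The function names, the bandwidth of 0.6, and the example distributions are all illustrative assumptions.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def continuized_cdf(x, scores, probs, h):
    """Gaussian-kernel continuization of a discrete score distribution.

    The shrinkage factor `a` keeps the mean and variance of the
    continuized distribution equal to those of the discrete one.
    """
    mu = sum(s * p for s, p in zip(scores, probs))
    var = sum(p * (s - mu) ** 2 for s, p in zip(scores, probs))
    a = math.sqrt(var / (var + h * h))
    return sum(
        p * normal_cdf((x - a * s - (1.0 - a) * mu) / (a * h))
        for s, p in zip(scores, probs)
    )


def equate(x, scores_x, probs_x, scores_y, probs_y, h_x=0.6, h_y=0.6):
    """Map a score x on form X to the form Y scale: G_h^{-1}(F_h(x)).

    The inverse is found by bisection, since the continuized CDF is
    strictly increasing. Bandwidths are hypothetical defaults.
    """
    target = continuized_cdf(x, scores_x, probs_x, h_x)
    lo, hi = min(scores_y) - 10.0, max(scores_y) + 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if continuized_cdf(mid, scores_y, probs_y, h_y) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical score probabilities for two five-point test forms.
    scores = [0, 1, 2, 3, 4]
    probs_x = [0.10, 0.20, 0.40, 0.20, 0.10]
    probs_y = [0.05, 0.15, 0.30, 0.30, 0.20]
    for x in scores:
        print(x, round(equate(x, scores, probs_x, scores, probs_y), 3))
```

In a sanity check, two forms with identical score distributions should yield an equating function close to the identity, and shifting every form Y score by a constant should shift the equated scores by the same constant.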


Source journal

Applied Psychological Measurement Multiple-

CiteScore

2.30

Self-citation rate

8.30%

Articles published

About the journal: Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.