Using a Generalized Logistic Regression Method to Detect Differential Item Functioning With Multiple Groups in Cognitive Diagnostic Tests.

IF 1.2; Zone 4, Psychology; Q4 PSYCHOLOGY, MATHEMATICAL

Applied Psychological Measurement Pub Date : 2023-06-01 Epub Date: 2023-05-13 DOI:10.1177/01466216231174559

Xiaojian Sun, Shimeng Wang, Lei Guo, Tao Xin, Naiqing Song

Applied Psychological Measurement, vol. 47, no. 4, pp. 328-346. Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10240570/pdf/

Citation count: 0

Abstract

Items that exhibit differential item functioning (DIF) compromise the validity and fairness of a test. Studies have investigated the DIF effect in the context of cognitive diagnostic assessment (CDA), and several DIF detection methods have been proposed. Most of these methods are designed to detect DIF between two groups; however, empirical situations may involve more than two groups. To date, only a handful of studies have examined DIF with multiple groups in the CDA context. This study uses the generalized logistic regression (GLR) method to detect DIF items, with the estimated attribute profile as the matching criterion. A simulation study examines the performance of two GLR methods, the GLR-based Wald test (GLR-Wald) and the GLR-based likelihood ratio test (GLR-LRT), in detecting DIF items; results based on the ordinary Wald test are also reported. Results show that (1) both GLR-Wald and GLR-LRT perform more reasonably than the ordinary Wald test in controlling Type I error rates in most conditions; (2) the GLR method also produces higher empirical rejection rates than the ordinary Wald test in most conditions; and (3) using the estimated attribute profile as the matching criterion produces similar Type I error rates and empirical rejection rates for GLR-Wald and GLR-LRT. A real data example is also analyzed to illustrate the application of these DIF detection methods with multiple groups.
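To make the GLR-LRT idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a common GLR formulation in which a matching-only logistic model is compared with a model that adds group dummies and group-by-matching interactions. The matching score here is a hypothetical stand-in (a count from 0 to 3) for the estimated attribute profile, and all function names, sample sizes, and effect sizes are illustrative assumptions. Only the likelihood ratio test is sketched; the Wald variant is not shown.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(X, y, iters=25):
    """Newton-Raphson logistic fit; returns (coefficients, log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        ll += yi * eta - math.log(1.0 + math.exp(eta))
    return beta, ll

def chi2_sf_even(x, df):
    """Chi-square upper-tail probability; closed form valid for even df only."""
    h = x / 2.0
    return math.exp(-h) * sum(h ** j / math.factorial(j) for j in range(df // 2))

def glr_lrt(score, group, y, n_groups):
    """LRT: matching-only model vs. model adding group and group-by-score terms."""
    null_X = [[1.0, s] for s in score]
    full_X = []
    for s, g in zip(score, group):
        # Group 0 is the reference; one dummy per remaining group.
        dummies = [1.0 if g == j else 0.0 for j in range(1, n_groups)]
        full_X.append([1.0, s] + dummies + [d * s for d in dummies])
    _, ll0 = fit_logistic(null_X, y)
    _, ll1 = fit_logistic(full_X, y)
    stat = 2.0 * (ll1 - ll0)
    df = 2 * (n_groups - 1)  # always even, so the closed form applies
    return stat, df, chi2_sf_even(max(stat, 0.0), df)

def simulate(dif_shift, n_per_group=500, seed=1):
    """Simulate one item for three groups; group 2 gets a uniform DIF shift."""
    rng = random.Random(seed)
    score, group, y = [], [], []
    for g in range(3):
        for _ in range(n_per_group):
            s = float(rng.randint(0, 3))  # hypothetical matching score
            eta = -1.0 + 0.8 * s + (dif_shift if g == 2 else 0.0)
            p = 1.0 / (1.0 + math.exp(-eta))
            score.append(s)
            group.append(g)
            y.append(1 if rng.random() < p else 0)
    return score, group, y

score, group, y = simulate(dif_shift=-1.2)
stat, df, p = glr_lrt(score, group, y, n_groups=3)
print(f"LRT statistic = {stat:.2f}, df = {df}, p = {p:.4g}")
```

A small p-value flags the item as showing DIF across the three groups. Testing only the group dummies would target uniform DIF, and testing only the interaction terms would target nonuniform DIF; the sketch tests both sets jointly.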


Source journal

Applied Psychological Measurement Multiple-

CiteScore: 2.30

Self-citation rate: 8.30%

About the journal: Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.