Effect Sizes for Estimating Differential Item Functioning Influence at the Test Level
W. H. Finch, B. French
Psych, published 2023-02-15
DOI: 10.3390/psych5010013 (https://doi.org/10.3390/psych5010013)
Citation count: 2
Abstract
Differential item functioning (DIF) analysis is a critical step in providing evidence to support a scoring inference when building a validity argument for a psychological or educational assessment. Effect sizes can help in understanding how DIF accumulates at the test score level. This simulation study investigated the performance of several proposed effect size measures under conditions that varied sample size, the magnitude of DIF, the proportion of items with DIF, and the type of DIF (additive vs. non-additive). The effect sizes studied were sDTF%, uDTF%, $\tau_w^2$, $d$, $R_\Delta^2$, $I_{\mathrm{DIF}}^2$*, and S-DIF-V. The results suggest that, across study conditions, $\tau_w^2$, $I_{\mathrm{DIF}}^2$*, and $d$ were consistently the most accurate measures of the DIF effects. The effect sizes were also estimated in an empirical example. Recommendations and implications for practice are discussed.