Methods for the Comparison of Differential Item Functioning across Assessments.

Journal of Applied Measurement. Pub Date: 2018-01-01
W Holmes Finch, Maria Hernandez Finch, Brian F French, David E McIntosh, Lauren Moss
Volume 19, issue 1, pages 26-40.

Abstract

An important aspect of the educational and psychological evaluation of individuals is the selection of scales with appropriate evidence of reliability and validity for inferences and uses of the scores for the population of interest. One key aspect of validity is the degree to which a scale fairly assesses the construct(s) of interest for members of different subgroups within the population. Typically, this issue is addressed statistically through assessment of differential item functioning (DIF) of individual items, or differential test functioning (DTF) of sets of items within the same measure. When selecting an assessment to use for a given application (e.g., measuring intelligence), or which form of an assessment to use for a test administration, researchers need to consider the extent to which the scales work with all members of the population. Little research has examined methods for comparing the amount or magnitude of DIF/DTF present in two or more assessments when deciding which assessment to use. The current study made use of 7 different statistics for this purpose, in the context of intelligence testing. Results demonstrate that by using a variety of effect sizes, the researcher can gain insights into not only which scales may contain the least amount of DTF, but also how they differ with regard to the way in which the DTF manifests itself.
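The abstract does not name the seven statistics used in the study. Purely as an illustration of what a DIF effect size can look like, the sketch below computes the widely used Mantel-Haenszel common odds ratio and its ETS delta transformation for a single item; this is not claimed to be one of the authors' statistics. The function name, the input layout, and the counts in the usage example are hypothetical.

```python
import math


def mantel_haenszel_dif(strata):
    """Mantel-Haenszel DIF effect size for one dichotomous item.

    `strata` is an iterable of 4-tuples, one per matched total-score level:
    (reference_correct, reference_incorrect, focal_correct, focal_incorrect).
    Returns (alpha_mh, delta_mh). This layout is an assumption made for
    illustration, not something taken from the article.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty score level carries no information
        numerator += a * d / n    # reference correct x focal incorrect
        denominator += b * c / n  # reference incorrect x focal correct
    if numerator <= 0 or denominator <= 0:
        raise ValueError("odds ratio is undefined for these counts")
    alpha_mh = numerator / denominator
    # ETS delta scale: negative values indicate the item favors the reference group.
    delta_mh = -2.35 * math.log(alpha_mh)
    return alpha_mh, delta_mh


# Hypothetical counts: both groups perform identically at each score level.
no_dif = [(30, 10, 30, 10), (20, 20, 20, 20)]
# Hypothetical counts: the reference group answers correctly far more often.
strong_dif = [(40, 10, 20, 30)]

alpha_none, delta_none = mantel_haenszel_dif(no_dif)
alpha_strong, delta_strong = mantel_haenszel_dif(strong_dif)
print(alpha_none, delta_none)
print(alpha_strong, delta_strong)
```

Computing such an index item by item for each candidate assessment, then comparing the distributions of the values, is one simple way the comparison described in the abstract could be approached; the study itself reports using several effect sizes rather than a single one.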
