使用可解释机器学习检测心理测验中的差异项目功能

IF 16.4 1区化学 Q1 CHEMISTRY, MULTIDISCIPLINARY

Accounts of Chemical Research Pub Date : 2024-03-11 DOI:10.1177/01466216241238744

E. Kraus, Johannes Wild, Sven Hilbert

{"title":"使用可解释机器学习检测心理测验中的差异项目功能","authors":"E. Kraus, Johannes Wild, Sven Hilbert","doi":"10.1177/01466216241238744","DOIUrl":null,"url":null,"abstract":"This study presents a novel method to investigate test fairness and differential item functioning combining psychometrics and machine learning. Test unfairness manifests itself in systematic and demographically imbalanced influences of confounding constructs on residual variances in psychometric modeling. Our method aims to account for resulting complex relationships between response patterns and demographic attributes. Specifically, it measures the importance of individual test items, and latent ability scores in comparison to a random baseline variable when predicting demographic characteristics. We conducted a simulation study to examine the functionality of our method under various conditions such as linear and complex impact, unfairness and varying number of factors, unfair items, and varying test length. We found that our method detects unfair items as reliably as Mantel–Haenszel statistics or logistic regression analyses but generalizes to multidimensional scales in a straight forward manner. To apply the method, we used random forests to predict migration backgrounds from ability scores and single items of an elementary school reading comprehension test. One item was found to be unfair according to all proposed decision criteria. Further analysis of the item’s content provided plausible explanations for this finding. Analysis code is available at: https://osf.io/s57rw/?view_only=47a3564028d64758982730c6d9c6c547 .","PeriodicalId":1,"journal":{"name":"Accounts of Chemical Research","volume":"11 3","pages":""},"PeriodicalIF":16.4000,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Using Interpretable Machine Learning for Differential Item Functioning Detection in Psychometric Tests\",\"authors\":\"E. Kraus, Johannes Wild, Sven Hilbert\",\"doi\":\"10.1177/01466216241238744\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This study presents a novel method to investigate test fairness and differential item functioning combining psychometrics and machine learning. Test unfairness manifests itself in systematic and demographically imbalanced influences of confounding constructs on residual variances in psychometric modeling. Our method aims to account for resulting complex relationships between response patterns and demographic attributes. Specifically, it measures the importance of individual test items, and latent ability scores in comparison to a random baseline variable when predicting demographic characteristics. We conducted a simulation study to examine the functionality of our method under various conditions such as linear and complex impact, unfairness and varying number of factors, unfair items, and varying test length. We found that our method detects unfair items as reliably as Mantel–Haenszel statistics or logistic regression analyses but generalizes to multidimensional scales in a straight forward manner. To apply the method, we used random forests to predict migration backgrounds from ability scores and single items of an elementary school reading comprehension test. One item was found to be unfair according to all proposed decision criteria. Further analysis of the item’s content provided plausible explanations for this finding. Analysis code is available at: https://osf.io/s57rw/?view_only=47a3564028d64758982730c6d9c6c547 .\",\"PeriodicalId\":1,\"journal\":{\"name\":\"Accounts of Chemical Research\",\"volume\":\"11 3\",\"pages\":\"\"},\"PeriodicalIF\":16.4000,\"publicationDate\":\"2024-03-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Accounts of Chemical Research\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://doi.org/10.1177/01466216241238744\",\"RegionNum\":1,\"RegionCategory\":\"化学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"CHEMISTRY, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Accounts of Chemical Research","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1177/01466216241238744","RegionNum":1,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MULTIDISCIPLINARY","Score":null,"Total":0}

引用次数: 0

摘要

本研究提出了一种结合心理测量学和机器学习研究测验公平性和差异项目功能的新方法。在心理测量建模中，测验不公平表现为混杂因素对残差的系统性和人口统计不平衡影响。我们的方法旨在解释由此产生的应答模式与人口统计学属性之间的复杂关系。具体来说，它衡量了单个测试项目和潜在能力分数与随机基线变量相比在预测人口统计学特征时的重要性。我们进行了一项模拟研究，以检验我们的方法在不同条件下的功能，如线性和复杂影响、不公平和不同因素数量、不公平项目和不同测试长度。我们发现，我们的方法能像曼特尔-海恩泽尔统计或逻辑回归分析一样可靠地检测出不公平项目，而且能直接推广到多维量表。为了应用该方法，我们使用随机森林从小学阅读理解测试的能力分数和单个项目中预测迁移背景。根据所有建议的判定标准，我们发现有一个项目是不公平的。对该项目内容的进一步分析为这一发现提供了合理的解释。分析代码见：https://osf.io/s57rw/?view_only=47a3564028d64758982730c6d9c6c547 。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Using Interpretable Machine Learning for Differential Item Functioning Detection in Psychometric Tests

This study presents a novel method to investigate test fairness and differential item functioning combining psychometrics and machine learning. Test unfairness manifests itself in systematic and demographically imbalanced influences of confounding constructs on residual variances in psychometric modeling. Our method aims to account for resulting complex relationships between response patterns and demographic attributes. Specifically, it measures the importance of individual test items, and latent ability scores in comparison to a random baseline variable when predicting demographic characteristics. We conducted a simulation study to examine the functionality of our method under various conditions such as linear and complex impact, unfairness and varying number of factors, unfair items, and varying test length. We found that our method detects unfair items as reliably as Mantel–Haenszel statistics or logistic regression analyses but generalizes to multidimensional scales in a straight forward manner. To apply the method, we used random forests to predict migration backgrounds from ability scores and single items of an elementary school reading comprehension test. One item was found to be unfair according to all proposed decision criteria. Further analysis of the item’s content provided plausible explanations for this finding. Analysis code is available at: https://osf.io/s57rw/?view_only=47a3564028d64758982730c6d9c6c547 .

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Accounts of Chemical Research 化学-化学综合

CiteScore

31.40

自引率

1.10%

发文量

312

审稿时长

2 months

期刊介绍： Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author’s own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance. Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research affording the reader a closer look at the content and significance of an article. Through this provision of a more detailed description of the article contents, the Conspectus enhances the article's discoverability by search engines and the exposure for the research.