{"title":"Exploring the Impact of Deleting (or Retaining) a Biased Item: A Procedure Based on Classification Accuracy.","authors":"Meltem Ozcan, Mark H C Lai","doi":"10.1177/10731911241298081","DOIUrl":null,"url":null,"abstract":"<p><p>Psychological test scores are commonly used in high-stakes settings to classify individuals. While measurement invariance across groups is necessary for valid and meaningful inferences of group differences, full measurement invariance rarely holds in practice. The classification accuracy analysis framework aims to quantify the degree and practical impact of noninvariance. However, how to best navigate the next steps remains unclear, and methods devised to account for noninvariance at the group level may be insufficient when the goal is classification. Furthermore, deleting a biased item may improve fairness but negatively affect performance, and replacing the test can be costly. We propose item-level effect size indices that allow test users to make more informed decisions by quantifying the impact of deleting (or retaining) an item on test performance and fairness, provide an illustrative example, and introduce <i>unbiasr</i>, an R package implementing the proposed methods.</p>","PeriodicalId":8577,"journal":{"name":"Assessment","volume":" ","pages":"10731911241298081"},"PeriodicalIF":3.5000,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Assessment","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1177/10731911241298081","RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, CLINICAL","Score":null,"Total":0}
Abstract
Psychological test scores are commonly used in high-stakes settings to classify individuals. While measurement invariance across groups is necessary for valid and meaningful inferences of group differences, full measurement invariance rarely holds in practice. The classification accuracy analysis framework aims to quantify the degree and practical impact of noninvariance. However, how to best navigate the next steps remains unclear, and methods devised to account for noninvariance at the group level may be insufficient when the goal is classification. Furthermore, deleting a biased item may improve fairness but negatively affect performance, and replacing the test can be costly. We propose item-level effect size indices that allow test users to make more informed decisions by quantifying the impact of deleting (or retaining) an item on test performance and fairness, provide an illustrative example, and introduce unbiasr, an R package implementing the proposed methods.
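The following base-R sketch is a minimal, self-contained illustration of the classification accuracy idea the abstract refers to: selection decisions made from an observed composite are evaluated against true latent standing, with and without a biased item in the composite. All parameter values, group settings, and helper names below are hypothetical illustrations chosen for this sketch; this is not the unbiasr API.

```r
## Toy simulation of classification accuracy with one biased item.
## Hypothetical setup -- NOT the unbiasr package's interface.
set.seed(1)
n <- 1e5

# Simulate item responses for one group under a single-factor model
sim_group <- function(n, loadings, intercepts, latent_mean) {
  eta <- rnorm(n, mean = latent_mean, sd = 1)          # latent trait
  items <- sapply(seq_along(loadings), function(j)
    intercepts[j] + loadings[j] * eta + rnorm(n, sd = 0.6))
  list(eta = eta, items = items)
}

# Classification accuracy indices for a given pair of cutoffs
accuracy <- function(eta, total, eta_cut, total_cut) {
  selected  <- total >= total_cut   # decision from observed scores
  qualified <- eta >= eta_cut       # "truth" from the latent trait
  c(sensitivity   = mean(selected[qualified]),
    specificity   = mean(!selected[!qualified]),
    prop_selected = mean(selected))
}

loadings <- c(.8, .7, .75, .6, .65)
int_ref  <- rep(0, 5)
int_foc  <- c(0, 0, -0.5, 0, 0)  # item 3: intercept bias against focal group

ref <- sim_group(n, loadings, int_ref, latent_mean = 0)
foc <- sim_group(n, loadings, int_foc, latent_mean = 0)

eta_cut <- qnorm(.75)  # top 25% on the latent trait count as "qualified"

# Accuracy per group for a composite built from the retained items
report <- function(ref, foc, keep) {
  tot_ref <- rowSums(ref$items[, keep, drop = FALSE])
  tot_foc <- rowSums(foc$items[, keep, drop = FALSE])
  cut <- quantile(c(tot_ref, tot_foc), .75)  # select top ~25% overall
  rbind(reference = accuracy(ref$eta, tot_ref, eta_cut, cut),
        focal     = accuracy(foc$eta, tot_foc, eta_cut, cut))
}

report(ref, foc, keep = 1:5)            # full test, biased item retained
report(ref, foc, keep = c(1, 2, 4, 5))  # biased item deleted
```

In this toy setup, deleting item 3 should bring the focal group's selection rate and sensitivity closer to the reference group's, while the shorter composite is a somewhat noisier measure of the trait for both groups, which is exactly the fairness-versus-performance trade-off the proposed indices are meant to quantify.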
Journal Description
Assessment publishes articles in the domain of applied clinical assessment. The journal emphasizes information relevant to the use of assessment measures, including test development, validation, and interpretation practices. Its scope includes research that can inform assessment practices in mental health, forensic, medical, and other applied settings. Papers that focus on the assessment of cognitive and neuropsychological functioning, personality, and psychopathology are invited. Most papers published in Assessment report the results of original empirical research; however, integrative review articles and scholarly case studies will also be considered.