Bias amplification to facilitate the systematic evaluation of bias mitigation methods
Alexis Burgon, Yuhang Zhang, Nicholas Petrick, Berkman Sahiner, Kenny H Cha, Ravi K Samala
IEEE Journal of Biomedical and Health Informatics, published 2024-11-05. DOI: 10.1109/JBHI.2024.3491946
Abstract
The future of artificial intelligence (AI) safety is expected to include bias mitigation methods from development to application. The complexity and integration of these methods could grow in conjunction with advances in AI and human-AI interactions. Numerous methods are being proposed to mitigate bias, but there is no structured way to compare their strengths and weaknesses. In this work, we present two approaches to systematically amplify subgroup performance bias. These approaches allow for the evaluation and comparison of the effectiveness of bias mitigation methods on AI models by varying the degree of bias, and can be applied to any classification model. We used these approaches to compare four off-the-shelf bias mitigation methods. Both amplification approaches promote the development of learning shortcuts, in which the model forms associations between patient attributes and AI output. We demonstrate these approaches in a case study, evaluating bias in the determination of COVID status from chest X-rays. The maximum achieved increases in performance bias, measured as differences in predicted prevalence, were 72% and 32% for bias between subgroups defined by patient sex and race, respectively. These changes in predicted prevalence were not accompanied by substantial changes in the differences in subgroup area under the receiver operating characteristic curve (AUROC), indicating that the increased bias is due to the formation of learning shortcuts, not a difference between subgroups in the ability to distinguish positive from negative patients.
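As an illustration of the two bias measures named in the abstract, the sketch below computes the difference in predicted prevalence and the difference in subgroup AUROC between two patient subgroups. This is a minimal, hypothetical example: the function name, the fixed decision threshold, and the use of scikit-learn's roc_auc_score are assumptions for illustration, not the authors' implementation or their amplification procedure.

```python
# Illustrative sketch (not the authors' code): subgroup bias measures for a
# binary classifier, as described in the abstract.
import numpy as np
from sklearn.metrics import roc_auc_score

def subgroup_bias_measures(y_true, y_score, group, threshold=0.5):
    """Return (predicted-prevalence difference, AUROC difference) between the
    two subgroups encoded in `group` (e.g., patient sex or race)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    group = np.asarray(group)
    g0, g1 = np.unique(group)  # assumes exactly two subgroups are present

    # Predicted prevalence: fraction of each subgroup the model labels positive.
    pred = (y_score >= threshold).astype(int)
    prev_diff = pred[group == g0].mean() - pred[group == g1].mean()

    # Subgroup AUROC: ability to separate positives from negatives within each subgroup.
    auc_diff = (roc_auc_score(y_true[group == g0], y_score[group == g0])
                - roc_auc_score(y_true[group == g1], y_score[group == g1]))
    return prev_diff, auc_diff
```

In this framing, a large predicted-prevalence difference accompanied by little change in the AUROC difference corresponds to the learning-shortcut pattern the abstract describes.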
About the journal:
IEEE Journal of Biomedical and Health Informatics publishes original papers presenting recent advances where information and communication technologies intersect with health, healthcare, life sciences, and biomedicine. Topics include acquisition, transmission, storage, retrieval, management, and analysis of biomedical and health information. The journal covers applications of information technologies in healthcare, patient monitoring, preventive care, early disease diagnosis, therapy discovery, and personalized treatment protocols. It explores electronic medical and health records, clinical information systems, decision support systems, medical and biological imaging informatics, wearable systems, body area/sensor networks, and more. Integration-related topics like interoperability, evidence-based medicine, and secure patient data are also addressed.