Ensuring medical AI safety: interpretability-driven detection and mitigation of spurious model behavior and associated data.

Impact Factor 2.9 · CAS Zone 3, Computer Science · JCR Q2, Computer Science, Artificial Intelligence

Machine Learning · Pub Date: 2025-01-01 · Epub Date: 2025-08-12 · DOI: 10.1007/s10994-025-06834-w

Frederik Pahde, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek

Machine Learning, volume 114, issue 9, page 206. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12343733/pdf/


Abstract

Deep neural networks are increasingly employed in high-stakes medical applications, despite their tendency toward shortcut learning in the presence of spurious correlations, which can have potentially fatal consequences in practice. Whereas a multitude of works address either the detection or the mitigation of such shortcut behavior in isolation, the Reveal2Revise approach provides a comprehensive bias mitigation framework that combines these steps. However, effectively addressing these biases often requires substantial labeling effort from domain experts. In this work, we review the steps of the Reveal2Revise framework and enhance it with semi-automated interpretability-based bias annotation capabilities. This includes methods for sample- and feature-level bias annotation, which provide valuable information that bias mitigation methods can use to unlearn the undesired shortcut behavior. We show the applicability of the framework using four medical datasets across two modalities, featuring controlled and real-world spurious correlations caused by data artifacts. We successfully identify and mitigate these biases in VGG16, ResNet50, and contemporary Vision Transformer models, ultimately increasing their robustness and applicability for real-world medical tasks. Our code is available at https://github.com/frederikpahde/medical-ai-safety.
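To make the idea of sample- and feature-level bias annotation more concrete, the following is a minimal toy sketch and not the authors' implementation (their code is in the repository linked above). It assumes that latent activations are available as plain lists of floats, and it uses a simple difference-of-means vector as a stand-in for a learned bias direction. All function names (make_acts, bias_direction, suppress_bias) and the synthetic data are hypothetical and exist only for illustration.

```python
import random

def make_acts(n, dim, offset, rng):
    """Toy latent activations: Gaussian noise plus a fixed per-dimension offset."""
    return [[rng.gauss(0.0, 0.5) + offset[j] for j in range(dim)] for _ in range(n)]

def mean_vector(rows):
    """Column-wise mean of a list of equally long vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bias_direction(artifact_acts, clean_acts):
    """Unit vector pointing from the clean mean to the artifact mean.

    Assumption: a difference-of-means direction is used here as a simple
    proxy for a feature-level bias representation.
    """
    diff = [a - c for a, c in zip(mean_vector(artifact_acts), mean_vector(clean_acts))]
    norm = dot(diff, diff) ** 0.5
    return [d / norm for d in diff]

def suppress_bias(act, direction, reference):
    """Shift an activation along the bias direction until its projection
    matches that of a reference point (here: the clean mean)."""
    shift = dot(act, direction) - dot(reference, direction)
    return [a - shift * d for a, d in zip(act, direction)]

# Synthetic setup: the artifact shows up as an offset in latent dimension 2.
rng = random.Random(0)
clean = make_acts(50, 4, [0.0, 0.0, 0.0, 0.0], rng)
artifact = make_acts(50, 4, [0.0, 0.0, 3.0, 0.0], rng)

direction = bias_direction(artifact, clean)
clean_mean = mean_vector(clean)

# Feature level: which latent dimension carries most of the bias direction?
dominant = max(range(4), key=lambda i: abs(direction[i]))
print("dominant latent dimension:", dominant)

# Sample level: rank all samples by their projection onto the bias direction;
# high-scoring samples are candidates for containing the artifact.
ranked = sorted(clean + artifact, key=lambda a: dot(a, direction), reverse=True)
print("top bias score:", round(dot(ranked[0], direction), 3))

# Mitigation: remove the bias component from one artifact sample.
fixed = suppress_bias(artifact[0], direction, clean_mean)
print("score before:", round(dot(artifact[0], direction), 3))
print("score after: ", round(dot(fixed, direction), 3))
```

In a real pipeline the activations would come from an intermediate layer of the network, and the ranking would be reviewed by a domain expert rather than trusted blindly, which is where the semi-automated aspect described in the abstract comes in.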


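Source journal: Machine Learning (Engineering & Technology; Computer Science: Artificial Intelligence)

CiteScore: 11.00

Self-citation rate: 2.70%

Articles published: 162

Review time: 3 months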

About the journal: Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.