What are you looking at? Modality contribution in multimodal medical deep learning.

IF 2.3 · CAS Region 3 (Medicine) · Q3 ENGINEERING, BIOMEDICAL
Christian Gapp, Elias Tappeiner, Martin Welk, Karl Fritscher, Elke R Gizewski, Rainer Schubert
{"title":"What are you looking at? Modality contribution in multimodal medical deep learning.","authors":"Christian Gapp, Elias Tappeiner, Martin Welk, Karl Fritscher, Elke R Gizewski, Rainer Schubert","doi":"10.1007/s11548-025-03523-w","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>High dimensional, multimodal data can nowadays be analyzed by huge deep neural networks with little effort. Several fusion methods for bringing together different modalities have been developed. Given the prevalence of high-dimensional, multimodal patient data in medicine, the development of multimodal models marks a significant advancement. However, how these models process information from individual sources in detail is still underexplored.</p><p><strong>Methods: </strong>To this end, we implemented an occlusion-based modality contribution method that is both model- and performance agnostic. This method quantitatively measures the importance of each modality in the dataset for the model to fulfill its task. We applied our method to three different multimodal medical problems for experimental purposes.</p><p><strong>Results: </strong>Herein we found that some networks have modality preferences that tend to unimodal collapses, while some datasets are imbalanced from the ground up. Moreover, we provide fine-grained quantitative and visual attribute importance for each modality.</p><p><strong>Conclusion: </strong>Our metric offers valuable insights that can support the advancement of multimodal model development and dataset creation. By introducing this method, we contribute to the growing field of interpretability in deep learning for multimodal research. This approach helps to facilitate the integration of multimodal AI into clinical practice. Our code is publicly available at https://github.com/ChristianGappGit/MC_MMD.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer Assisted Radiology and Surgery","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1007/s11548-025-03523-w","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
Citations: 0

Abstract

Purpose: High-dimensional, multimodal data can nowadays be analyzed by large deep neural networks with little effort. Several fusion methods for bringing together different modalities have been developed. Given the prevalence of high-dimensional, multimodal patient data in medicine, the development of multimodal models marks a significant advancement. However, how these models process information from the individual sources remains underexplored.

Methods: To this end, we implemented an occlusion-based modality contribution method that is both model- and performance-agnostic. The method quantitatively measures how important each modality in the dataset is for the model to fulfill its task. We applied it to three different multimodal medical problems for experimental purposes.
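The abstract describes the method only at a high level. As an illustration, the following is a minimal sketch of how an occlusion-based modality contribution score could be computed: each modality is occluded in turn and the change in the model's output is measured. The interface (a model accepting a dict of modality tensors, the constant occlusion value) is an assumption made for this sketch and does not reflect the authors' actual implementation, which is available in their repository.

```python
import torch

def modality_contribution(model, batch, occlusion_value=0.0):
    """Minimal sketch of an occlusion-based modality contribution score.

    Hypothetical setup: `batch` is a dict mapping modality names to tensors,
    and `model` accepts such a dict. For each modality, its input is replaced
    by a constant (occluded) and the change in the model's output relative to
    the unoccluded prediction is measured; a larger change means a larger
    contribution of that modality. No labels are needed, so the score is
    performance-agnostic.
    """
    model.eval()
    with torch.no_grad():
        baseline = model(batch)  # prediction with all modalities present
        deltas = {}
        for name, tensor in batch.items():
            occluded = dict(batch)
            occluded[name] = torch.full_like(tensor, occlusion_value)
            pred = model(occluded)  # prediction with this modality occluded
            deltas[name] = (baseline - pred).abs().mean().item()
    # Normalize so the contributions sum to 1 across modalities
    total = sum(deltas.values()) or 1.0
    return {name: d / total for name, d in deltas.items()}
```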

Results: We found that some networks exhibit modality preferences that tend toward unimodal collapse, while some datasets are imbalanced from the outset. Moreover, we provide fine-grained quantitative and visual attribute importance for each modality.
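The fine-grained attribute importance mentioned here can be illustrated by pushing the same occlusion idea inside a single modality. The sketch below assumes a tabular modality whose columns are occluded one at a time; the function name, arguments, and tensor shape are illustrative assumptions, not the authors' exact procedure.

```python
import torch

def attribute_importance(model, batch, modality, occlusion_value=0.0):
    """Sketch of fine-grained attribute importance within one modality.

    Hypothetical setup: batch[modality] is a tensor of shape
    (batch_size, n_attributes), e.g. tabular clinical data. Each attribute
    (column) is occluded in turn, and the resulting change in the model's
    output is taken as that attribute's importance score.
    """
    model.eval()
    with torch.no_grad():
        baseline = model(batch)
        scores = []
        n_attributes = batch[modality].shape[1]
        for j in range(n_attributes):
            occluded = dict(batch)
            occluded[modality] = batch[modality].clone()
            occluded[modality][:, j] = occlusion_value  # occlude one attribute
            pred = model(occluded)
            scores.append((baseline - pred).abs().mean().item())
    return scores  # one importance score per attribute (column)
```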

Conclusion: Our metric offers valuable insights that can support the advancement of multimodal model development and dataset creation. By introducing this method, we contribute to the growing field of interpretability in deep learning for multimodal research. The approach helps facilitate the integration of multimodal AI into clinical practice. Our code is publicly available at https://github.com/ChristianGappGit/MC_MMD.

Source journal:
International Journal of Computer Assisted Radiology and Surgery (ENGINEERING, BIOMEDICAL; RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)
CiteScore: 5.90
Self-citation rate: 6.70%
Articles per year: 243
Review time: 6-12 weeks
Journal description: The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.