Explainable variable-weight multi-modal based deep learning framework for catheter malposition detection

IF 14.7 · CAS Zone 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Yuhan Wang, Hak Keung Lam
Information Fusion, vol. 122, Article 103170. Published 2025-04-05. DOI: 10.1016/j.inffus.2025.103170
https://www.sciencedirect.com/science/article/pii/S156625352500243X
Citations: 0

Abstract

Hospital patients often have catheters and lines inserted for rapid administration of medicines or for medical tests. A misplaced catheter, however, can cause serious complications and even death. Deep learning frameworks have recently shown potential for assisting in the detection of catheter malposition in radiographs, but they face three main challenges: (1) Most approaches rely heavily on visual information and therefore need models with many parameters to detect malposition accurately. (2) Geometric information in the radiograph, which experts rely on for decision making, is often underutilized because it is difficult to extract accurately and to integrate with visual information. (3) Feature significance in catheter status detection is often underexplored, which makes frameworks difficult to interpret and calls for a mechanism that highlights the key factors influencing decisions. To address these challenges, an explainable variable-weight multimodal deep learning framework is proposed that fuses the visual and geometric information in the radiograph for catheter malposition detection. A convolutional neural network (CNN) stream and a graph convolutional neural network (GCN) stream, both with few learnable parameters, are designed to extract the visual and geometric information without compromising performance. A cross-modal attention block is proposed to capture the relationship between visual and geometric information. Furthermore, a multimodal variable-weight structure is proposed to fuse the modalities according to their significance. To show the contribution of each modality, a multimodal class activation map (MCAM) is designed to visualize the activated regions in the radiograph, indicating where the framework focuses. The proposed method achieves state-of-the-art performance, with a mean AUC of 0.8816 using 7.62 million parameters.
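
The abstract does not say how the cross-modal attention block or the variable-weight structure is implemented. As a rough illustration only, the sketch below uses two generic stand-ins: scaled dot-product attention between the two feature sets, and a softmax over per-modality scores to obtain fusion weights. The function names, the scoring vector `score_w`, the feature dimension, and the random inputs are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

def cross_modal_attention(queries, keys_values):
    """Generic scaled dot-product attention (assumed stand-in, not the paper's block).

    queries:     (n_q, d) features of the attending modality
    keys_values: (n_k, d) features of the attended modality
    returns:     (n_q, d) queries plus context gathered from the other modality
    """
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)  # (n_q, n_k) similarity scores
    attn = softmax(scores, axis=-1)                # each row sums to 1
    return queries + attn @ keys_values            # residual connection

def variable_weight_fusion(visual_vec, geometric_vec, score_w):
    """Fuse two pooled modality vectors with input-dependent weights.

    score_w: (d,) hypothetical scoring vector (would be learned in practice)
    returns: fused (d,) vector and the (2,) modality weights
    """
    stacked = np.stack([visual_vec, geometric_vec])  # (2, d)
    weights = softmax(stacked @ score_w)             # (2,), sums to 1
    fused = weights @ stacked                        # weighted sum, shape (d,)
    return fused, weights

# Random placeholders standing in for real stream outputs.
rng = np.random.default_rng(0)
d = 8
visual_tokens = rng.normal(size=(16, d))   # e.g. flattened CNN feature-map cells
geometric_nodes = rng.normal(size=(5, d))  # e.g. GCN node embeddings
score_w = rng.normal(size=d)

# Each modality attends to the other, then both are mean-pooled and fused.
visual_ctx = cross_modal_attention(visual_tokens, geometric_nodes)
geometric_ctx = cross_modal_attention(geometric_nodes, visual_tokens)
fused, weights = variable_weight_fusion(
    visual_ctx.mean(axis=0), geometric_ctx.mean(axis=0), score_w
)
print("modality weights (visual, geometric):", weights)
```

In a scheme like this, the two weights sum to one and can be read directly as the relative contribution of each modality for a given input, which is the kind of per-modality significance the abstract describes.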
Source journal
Information Fusion (Engineering & Technology - Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles published: 161
Review time: 7.9 months
Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.