Cross-attention guided loss-based deep dual-branch fusion network for liver tumor classification

IF 14.7 · CAS Tier 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Rui Wang, Xiaoshuang Shi, Shuting Pang, Yidi Chen, Xiaofeng Zhu, Wentao Wang, Jiabin Cai, Danjun Song, Kang Li
DOI: 10.1016/j.inffus.2024.102713 · Information Fusion, Vol. 114, Article 102713 · Published 2024-09-24 (Journal Article)
Citations: 0

Abstract

Recently, convolutional neural networks (CNNs) and multiple instance learning (MIL) methods have been successfully applied to MRI images. However, CNNs take the whole image as the model input and use downsampling (e.g., max or mean pooling) to reduce the feature-map size, possibly neglecting local details. MIL methods, in turn, learn instance-level or local features without considering spatial information. To overcome these issues, in this paper we propose a novel cross-attention guided loss-based dual-branch framework (LCA-DB) that leverages spatial and local image information simultaneously. It is composed of an image-based attention network (IA-Net), a patch-based attention network (PA-Net) and a cross-attention module (CA). Specifically, IA-Net directly learns image features with loss-based attention to mine significant regions, while PA-Net captures patch-specific representations to extract crucial patches related to the tumor. Additionally, the cross-attention module is designed to integrate patch-level features by using attention weights generated from each other, thereby assisting the two branches in mining supplementary region information and enhancing their interactive collaboration. Moreover, we employ an attention similarity loss to further reduce the semantic inconsistency of the attention weights obtained from the two branches. Finally, extensive experiments on three liver tumor classification tasks demonstrate the effectiveness of the proposed framework: on the seven-class LLD-MMRI-7 task, our method achieves 69.2% accuracy, a 65.9% F1 score and an 88.5% AUC, showing superior classification and interpretation performance over recent state-of-the-art methods. The source code of LCA-DB is available at https://github.com/Wangrui-berry/Cross-attention.
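The cross-attention exchange and the attention similarity loss described in the abstract can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: the function names, the weighted-pooling form of the fusion, and the use of mean-squared error for the similarity term are all assumptions made for clarity.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array of attention scores."""
    e = np.exp(x - x.max())
    return e / e.sum()


def cross_attention_fuse(feat_a, feat_b, attn_a, attn_b):
    """Re-weight each branch's patch features with the *other* branch's
    attention weights, then concatenate the two pooled vectors.

    feat_a, feat_b: (n_patches, dim) patch-level features of the two branches.
    attn_a, attn_b: (n_patches,) attention weights (each sums to 1).
    """
    pooled_a = attn_b @ feat_a  # branch A features guided by branch B's attention
    pooled_b = attn_a @ feat_b  # branch B features guided by branch A's attention
    return np.concatenate([pooled_a, pooled_b])


def attention_similarity_loss(attn_a, attn_b):
    """Penalize semantic inconsistency between the two branches' attention
    maps (here: mean-squared error between the two weight vectors)."""
    return float(np.mean((attn_a - attn_b) ** 2))


# Toy example: 16 patches with 8-dimensional features per branch.
rng = np.random.default_rng(0)
n_patches, dim = 16, 8
feat_img = rng.normal(size=(n_patches, dim))    # stand-in for IA-Net features
feat_patch = rng.normal(size=(n_patches, dim))  # stand-in for PA-Net features
attn_img = softmax(rng.normal(size=n_patches))
attn_patch = softmax(rng.normal(size=n_patches))

fused = cross_attention_fuse(feat_img, feat_patch, attn_img, attn_patch)
sim_loss = attention_similarity_loss(attn_img, attn_patch)
```

Note that identical attention maps yield a similarity loss of exactly zero, so minimizing this term pushes the two branches toward attending to the same regions, which is the stated purpose of the attention similarity loss.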
Source journal: Information Fusion (Engineering & Technology / Computer Science: Theory & Methods)
CiteScore: 33.20 · Self-citation rate: 4.30% · Articles per year: 161 · Review time: 7.9 months
Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems are welcome.