HMDA: A Hybrid Model with Multi-scale Deformable Attention for Medical Image Segmentation

Impact factor 6.7 · CAS Zone 2 (Medicine) · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS
Mengmeng Wu, Tiantian Liu, Xin Dai, Chuyang Ye, Jinglong Wu, Shintaro Funahashi, Tianyi Yan
Journal: IEEE Journal of Biomedical and Health Informatics (volume: PP; publication type: Journal Article)
Publication date: 2024-10-07
DOI: 10.1109/JBHI.2024.3469230 (https://doi.org/10.1109/JBHI.2024.3469230)
Source platform: Semanticscholar
Citations: 0

Abstract

Transformers have been applied to medical image segmentation tasks owing to their excellent long-range modeling capability, compensating for the failure of Convolutional Neural Networks (CNNs) to extract global features. However, the standard self-attention modules in Transformers, characterized by a uniform and inflexible pattern of attention distribution, frequently lead to unnecessary computational redundancy with high-dimensional data, consequently impeding the model's capacity to concentrate precisely on salient image regions. Additionally, achieving effective explicit interaction between the spatially detailed features captured by CNNs and the long-range contextual features provided by Transformers remains challenging. In this work, we propose a Hybrid Transformer and CNN architecture with Multi-scale Deformable Attention (HMDA), designed to address the aforementioned issues effectively. Specifically, we introduce a Multi-scale Spatially Adaptive Deformable Attention (MSADA) mechanism, which attends to a small set of key sampling points around a reference point within the multi-scale features, to achieve better performance. In addition, we propose the Cross Attention Bridge (CAB) module, which integrates multi-scale Transformer features and local features through channel-wise cross attention, enriching feature synthesis. HMDA is validated on multiple datasets, and the results demonstrate the effectiveness of our approach, which achieves competitive results compared to previous methods.
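
As an illustration only, the general idea of attending to a few sampling points around a reference point across several feature scales can be sketched as follows. This is not the authors' MSADA implementation, which the abstract does not specify. The function names, the single-channel feature maps, and the choice to pass offsets and attention logits in as plain inputs (in a trained model they would typically be predicted from the query) are all assumptions made for this sketch.

```python
import math

def bilinear_sample(fmap, x, y):
    """Sample a 2D grid (list of rows) at continuous pixel coords (x, y).
    Coordinates outside the grid are clamped to the border."""
    h, w = len(fmap), len(fmap[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def softmax(logits):
    """Numerically stable softmax over a flat list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def deformable_attention(fmaps, ref, offsets, logits):
    """Hypothetical single-query, single-channel multi-scale deformable attention.

    fmaps:   list of 2D grids, one per scale
    ref:     (rx, ry) reference point, normalized to [0, 1]
    offsets: offsets[l][k] = (dx, dy) in pixels of scale l (assumed given)
    logits:  logits[l][k] = unnormalized attention score (assumed given)
    """
    # Attention weights are normalized jointly over all scales and points.
    weights = softmax([v for level in logits for v in level])
    out, i = 0.0, 0
    for fmap, level_offsets in zip(fmaps, offsets):
        h, w = len(fmap), len(fmap[0])
        # Map the normalized reference point into this scale's pixel grid.
        cx, cy = ref[0] * (w - 1), ref[1] * (h - 1)
        for dx, dy in level_offsets:
            out += weights[i] * bilinear_sample(fmap, cx + dx, cy + dy)
            i += 1
    return out
```

The cost of each query depends on the number of sampled points rather than on the size of the feature maps, which is the property the abstract contrasts with uniform self-attention.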

Source journal
IEEE Journal of Biomedical and Health Informatics
Categories: COMPUTER SCIENCE, INFORMATION SYSTEMS; COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
CiteScore: 13.60
Self-citation rate: 6.50%
Articles published: 1151
About the journal: IEEE Journal of Biomedical and Health Informatics publishes original papers presenting recent advances where information and communication technologies intersect with health, healthcare, life sciences, and biomedicine. Topics include acquisition, transmission, storage, retrieval, management, and analysis of biomedical and health information. The journal covers applications of information technologies in healthcare, patient monitoring, preventive care, early disease diagnosis, therapy discovery, and personalized treatment protocols. It explores electronic medical and health records, clinical information systems, decision support systems, medical and biological imaging informatics, wearable systems, body area/sensor networks, and more. Integration-related topics like interoperability, evidence-based medicine, and secure patient data are also addressed.