PVT-MA: pyramid vision transformers with multi-attention fusion mechanism for polyp segmentation

Impact Factor: 3.4 · CAS Tier 2 (Computer Science) · JCR Q2 (Computer Science, Artificial Intelligence)
Xiao Shang, Siqi Wu, Yuhao Liu, Zhenfeng Zhao, Shenwen Wang
DOI: 10.1007/s10489-024-06041-5
Journal: Applied Intelligence, vol. 55, no. 1
Published: 2024-11-23 (Journal Article)
Full text: https://link.springer.com/article/10.1007/s10489-024-06041-5
Citations: 0

Abstract

Early diagnosis and prevention of colorectal cancer rely on colonoscopic polyp examination. Accurate automated polyp segmentation technology can assist clinicians in precisely identifying polyp regions, thereby conserving medical resources. Although deep learning-based image processing methods have shown immense potential in the field of automatic polyp segmentation, current automatic segmentation methods for colorectal polyps are still limited by factors such as the complex and variable intestinal environment and issues related to detection equipment, such as glare and motion blur. These limitations result in an inability to accurately distinguish polyps from surrounding mucosal tissue and to effectively identify tiny polyps. To address these challenges, we designed a multi-attention-based model, PVT-MA. Specifically, we developed the Cascading Attention Fusion (CAF) Module to accurately identify and locate polyps, reducing false positives caused by environmental factors and glare. Additionally, we introduced the Series Channels Coordinate Attention (SCC) Module to maximize the capture of polyp edge information. Furthermore, we incorporated the Receptive Field Block (RFB) Module to enhance polyp features and filter image noise. We conducted quantitative and qualitative evaluations using six metrics across four challenging datasets. Our PVT-MA model achieved top performance on three datasets and ranked second on one. The model has only 26.39M parameters, a computational cost of 10.33 GFlops, and delivers inference at a high speed of 47.6 frames per second (FPS).
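The SCC module's name suggests it builds on coordinate attention, which factorizes spatial attention into two direction-aware 1-D poolings (along height and along width) and gates the feature map with both. As a rough illustration only, not the authors' implementation: the per-channel weights `wh` and `ww` below are hypothetical stand-ins for the learned transforms in a real coordinate-attention block.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(x, wh, ww):
    """Gate a (C, H, W) feature map with direction-aware attention.

    wh, ww: shape (C,), hypothetical stand-ins for the learned 1x1 convs.
    """
    c, h, w = x.shape
    pool_h = x.mean(axis=2)                 # pool along width  -> (C, H)
    pool_w = x.mean(axis=1)                 # pool along height -> (C, W)
    a_h = sigmoid(wh[:, None] * pool_h)[:, :, None]  # (C, H, 1) gate
    a_w = sigmoid(ww[:, None] * pool_w)[:, None, :]  # (C, 1, W) gate
    return x * a_h * a_w                    # broadcast both gates over the map

x = np.random.randn(4, 8, 8)
y = coordinate_attention(x, np.ones(4), np.ones(4))
```

Because both gates are sigmoid outputs in (0, 1), the module can only attenuate activations; what it adds over plain channel attention is that the attenuation varies with row and column position, which is what helps localize thin structures such as polyp edges.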

Source journal: Applied Intelligence (Computer Science: Artificial Intelligence)
CiteScore: 6.60
Self-citation rate: 20.80%
Articles per year: 1361
Review time: 5.9 months
Journal description: With a focus on research in artificial intelligence and neural networks, this journal addresses solutions to real-life manufacturing, defense, management, government, and industrial problems that are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance. The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.