TAU-EffNetB7: A Novel Triple Attention U-Net Approach Using EfficientNetB7 for Enhanced Polyp Segmentation

IF 2.5; CAS Zone 4, Computer Science; JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology Pub Date : 2025-06-21 DOI:10.1002/ima.70144

Fouzia El Abassi, Aziz Darouichi, Aziz Ouaarab

Volume 35, Issue 4. Full text: https://onlinelibrary.wiley.com/doi/10.1002/ima.70144

Citations: 0

Abstract


Polyp segmentation is a critical but challenging task in clinical imaging because colonoscopic images are inherently complex and heterogeneous. Conventional single-stage segmentation networks generalize poorly and achieve only acceptable accuracy, particularly for small or uncertain polyps. To address these limitations, we propose two new models: TAU-EffNetB7 and TAU-EffNetB7 + Residual. These models apply triple-attention U-Net and triple-attention residual architectures, respectively, and incorporate cascaded stages, attention and residual operations, Atrous Spatial Pyramid Pooling, and transfer learning from EfficientNetB7. The multi-stage architecture enables progressive refinement of segmentations, better capture of multi-scale features, and accurate depiction of intricate boundaries. We evaluate our models on three publicly available colonoscopic datasets: Kvasir-SEG, CVC-ClinicDB, and CVC-ColonDB. TAU-EffNetB7 attains Dice Similarity Coefficients (DSC) of 89.54%, 94.62%, and 94.68% on these datasets, respectively. TAU-EffNetB7 + Residual achieves DSCs of 91.11%, 93.74%, and 94.72%, improving on Kvasir-SEG and CVC-ColonDB while trailing slightly on CVC-ClinicDB, and significantly outperforming baseline models such as U-Net and Attention U-Net. To assess generalization, we train the models on small subsets of the data (Kvasir-SEG1, CVC-ClinicDB1, and CVC-ColonDB1) and test them on the full datasets. Both models perform strongly even with limited training data: TAU-EffNetB7 achieves 90.18% DSC when trained on Kvasir-SEG1, whereas TAU-EffNetB7 + Residual achieves 94.17% on CVC-ClinicDB and 94.68% on CVC-ColonDB when trained on their respective subsets. Notably, the residual-augmented model outperforms its counterpart in all but a few low-data scenarios.
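For readers unfamiliar with the metric reported above, the Dice Similarity Coefficient for a predicted mask P and a ground-truth mask T is 2|P ∩ T| / (|P| + |T|). The following is a minimal illustrative sketch in plain Python, not the authors' evaluation code: the function name `dice_coefficient`, the smoothing term `eps`, and the toy masks are assumptions introduced here for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[Sequence[int]],
                     target: Sequence[Sequence[int]],
                     eps: float = 1e-7) -> float:
    """Dice Similarity Coefficient between two binary masks of equal shape.

    DSC = 2 * |P ∩ T| / (|P| + |T|). The small eps (an assumption of this
    sketch) avoids division by zero when both masks are empty.
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    # Pixels that are foreground (1) in both masks.
    intersection = sum(
        p * t
        for p_row, t_row in zip(pred, target)
        for p, t in zip(p_row, t_row)
    )
    pred_area = sum(sum(row) for row in pred)
    target_area = sum(sum(row) for row in target)
    return (2.0 * intersection + eps) / (pred_area + target_area + eps)


# Toy 3x3 masks (hypothetical data): the prediction misses one polyp pixel.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 1],
          [0, 1, 1],
          [0, 0, 0]]

print(f"DSC = {dice_coefficient(pred, target):.4f}")
```

Here the masks share 3 foreground pixels, with 3 predicted and 4 true foreground pixels, so the score is 2·3 / (3 + 4) ≈ 0.857. A reported DSC of 94.72% corresponds to a value of about 0.947 on this scale.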


Source journal

International Journal of Imaging Systems and Technology (Engineering & Technology: Imaging Science & Photographic Technology)

CiteScore

6.90

Self-citation rate

6.10%

Articles published

138

Review time

3 months

About the journal: The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals. IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging. The journal is also open to imaging studies of the human body and of animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, replication studies, and negative results are also considered. The scope of the journal includes, but is not limited to, the following in the context of biomedical research: Imaging and neuro-imaging modalities: structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS, etc.; Neuromodulation and brain stimulation techniques such as TMS and tDCS; Software and hardware for imaging, especially related to human and animal health; Image segmentation in normal and clinical populations; Pattern analysis and classification using machine learning techniques; Computational modeling and analysis; Brain connectivity and connectomics; Systems-level characterization of brain function; Neural networks and neurorobotics; Computer vision, based on human/animal physiology; Brain-computer interface (BCI) technology; Big data, databasing and data mining.