Multiscale Feature Fusion Booster Network for Segmentation of Colorectal Polyp

IF 3.0, Zone 4 (Computer Science), JCR Q2 (Engineering, Electrical & Electronic)

International Journal of Imaging Systems and Technology, Pub Date: 2025-03-20, DOI: 10.1002/ima.70068

Malik Abdul Manan, Jinchao Feng, Shahzad Ahmed, Abdul Raheem

Volume 35, Issue 2. Publication type: Journal Article. Full text: https://onlinelibrary.wiley.com/doi/10.1002/ima.70068

Citations: 0

Abstract


Multiscale Feature Fusion Booster Network for Segmentation of Colorectal Polyp

Addressing the challenges posed by colorectal polyp variability and imaging inconsistencies in endoscopic images, we propose the multiscale feature fusion booster network (MFFB-Net), a novel deep learning (DL) framework for the semantic segmentation of colorectal polyps to aid in early colorectal cancer detection. Unlike prior models, such as the pyramid vision transformer-based cascaded attention decoder (PVT-CASCADE) and the parallel reverse attention network (PraNet), MFFB-Net enhances segmentation accuracy and efficiency through a unique fusion of multiscale feature extraction in both the encoder and decoder stages, coupled with a booster module for refining fine-grained details and a bottleneck module for efficient feature compression. The network leverages multipath feature extraction with skip connections, capturing both local and global contextual information, and is rigorously evaluated on seven benchmark datasets, including Kvasir, CVC-ClinicDB, CVC-ColonDB, ETIS, CVC-300, BKAI-IGH, and EndoCV2020. MFFB-Net achieves state-of-the-art (SOTA) performance, with Dice scores of 94.38%, 91.92%, 91.21%, 80.34%, 82.67%, 76.92%, and 74.29% on CVC-ClinicDB, Kvasir, CVC-300, ETIS, CVC-ColonDB, EndoCV2020, and BKAI-IGH, respectively, outperforming existing models in segmentation accuracy and computational efficiency. MFFB-Net achieves real-time processing speeds of 26 FPS with only 1.41 million parameters, making it well suited for real-world clinical applications. The results underscore the robustness of MFFB-Net, demonstrating its potential for real-time deployment in computer-aided diagnosis systems and setting a new benchmark for automated polyp segmentation.
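The abstract reports Dice scores without defining the metric. As a reminder, the standard Dice coefficient between a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The sketch below computes it for binary masks in plain Python. It is only an illustration of the usual definition: the function name `dice_score` is hypothetical, and the paper's exact evaluation protocol (thresholding, per-image versus per-dataset averaging) is not stated in this abstract.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1.

    Illustrative helper only; not taken from the MFFB-Net paper.
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: 2 overlapping foreground pixels, 3 foreground pixels in each mask.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
print(round(dice_score(pred, target), 4))  # 2*2 / (3+3) -> 0.6667
```

Under this definition, a reported score such as 94.38% corresponds to a value of 0.9438 from a function like the one above, typically averaged over the test images.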


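Source journal

International Journal of Imaging Systems and Technology (Engineering & Technology: Imaging Science & Photographic Technology)

CiteScore: 6.90

Self-citation rate: 6.10%

Articles published: 138

Review time: 3 months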

About the journal: The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals. IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging. The journal is also open to imaging studies of the human body and on animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, replication studies as well as negative results are also considered.

The scope of the journal includes, but is not limited to, the following in the context of biomedical research: Imaging and neuro-imaging modalities: structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS etc.; Neuromodulation and brain stimulation techniques such as TMS and tDCS; Software and hardware for imaging, especially related to human and animal health; Image segmentation in normal and clinical populations; Pattern analysis and classification using machine learning techniques; Computational modeling and analysis; Brain connectivity and connectomics; Systems-level characterization of brain function; Neural networks and neurorobotics; Computer vision, based on human/animal physiology; Brain-computer interface (BCI) technology; Big data, databasing and data mining.