Multiscale Feature Fusion Booster Network for Segmentation of Colorectal Polyp
Malik Abdul Manan, Jinchao Feng, Shahzad Ahmed, Abdul Raheem
International Journal of Imaging Systems and Technology, vol. 35, no. 2. Published 2025-03-20. Journal Article.
DOI: 10.1002/ima.70068
https://onlinelibrary.wiley.com/doi/10.1002/ima.70068
Impact factor: 3.0. JCR: Q2 (Engineering, Electrical & Electronic).
Citations: 0
Abstract
Addressing the challenges posed by colorectal polyp variability and imaging inconsistencies in endoscopic images, we propose the multiscale feature fusion booster network (MFFB-Net), a novel deep learning (DL) framework for the semantic segmentation of colorectal polyps to aid in early colorectal cancer detection. Unlike prior models, such as the pyramid vision transformer-based cascaded attention decoder (PVT-CASCADE) and the parallel reverse attention network (PraNet), MFFB-Net enhances segmentation accuracy and efficiency through a unique fusion of multiscale feature extraction in both the encoder and decoder stages, coupled with a booster module for refining fine-grained details and a bottleneck module for efficient feature compression. The network leverages multipath feature extraction with skip connections, capturing both local and global contextual information, and is rigorously evaluated on seven benchmark datasets, including Kvasir, CVC-ClinicDB, CVC-ColonDB, ETIS, CVC-300, BKAI-IGH, and EndoCV2020. MFFB-Net achieves state-of-the-art (SOTA) performance, with Dice scores of 94.38%, 91.92%, 91.21%, 80.34%, 82.67%, 76.92%, and 74.29% on CVC-ClinicDB, Kvasir, CVC-300, ETIS, CVC-ColonDB, EndoCV2020, and BKAI-IGH, respectively, outperforming existing models in segmentation accuracy and computational efficiency. MFFB-Net achieves real-time processing speeds of 26 FPS with only 1.41 million parameters, making it well suited for real-world clinical applications. The results underscore the robustness of MFFB-Net, demonstrating its potential for real-time deployment in computer-aided diagnosis systems and setting a new benchmark for automated polyp segmentation.
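For readers unfamiliar with the metric, the Dice score quoted in the abstract is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted mask A and a ground-truth mask B. The following is a minimal sketch of that general formula only, not the authors' evaluation code; the function name `dice_score`, the smoothing term `eps`, and the toy masks are illustrative assumptions.

```python
from typing import Sequence


def dice_score(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    eps: float = 1e-7,
) -> float:
    """Dice coefficient between two binary masks given as nested 0/1 sequences.

    `eps` is an assumed smoothing term, not taken from the paper.
    """
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p * t  # counts pixels that are 1 in both masks
            pred_sum += p
            target_sum += t
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (pred_sum + target_sum + eps)


# Toy 3x4 masks (hypothetical data): 4 predicted pixels, 3 true pixels, 3 shared.
pred_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
true_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
score = dice_score(pred_mask, true_mask)
print(f"Dice: {score:.4f}")  # 2*3 / (4+3) = 6/7, printed as 0.8571
```

A dataset-level figure such as those reported above is typically an average of this per-image value over the test set, although the exact averaging protocol is not stated in the abstract.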
About the journal:
The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals.
IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging.
The journal is also open to imaging studies of the human body and of animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, and replication studies, as well as negative results, are also considered.
The scope of the journal includes, but is not limited to, the following in the context of biomedical research:
Imaging and neuro-imaging modalities: structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS etc.;
Neuromodulation and brain stimulation techniques such as TMS and tDCS;
Software and hardware for imaging, especially related to human and animal health;
Image segmentation in normal and clinical populations;
Pattern analysis and classification using machine learning techniques;
Computational modeling and analysis;
Brain connectivity and connectomics;
Systems-level characterization of brain function;
Neural networks and neurorobotics;
Computer vision, based on human/animal physiology;
Brain-computer interface (BCI) technology;
Big data, databasing and data mining.