Attention-guided deep framework for polyp localization and subsequent classification via polyp local and Siamese feature fusion.

IF 2.6 Zone 4 (Medicine) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Medical & Biological Engineering & Computing Pub Date : 2025-09-01 Epub Date: 2025-05-02 DOI:10.1007/s11517-025-03369-z

Pradipta Sasmal, Susant Kumar Panigrahi, Swarna Laxmi Panda, M K Bhuyan

Pages: 2795-2814 Publication Type: Journal Article

Citations: 0

Abstract

Colorectal cancer (CRC) is one of the leading causes of death worldwide. This paper proposes an automated diagnostic technique to detect, localize, and classify polyps in colonoscopy video frames. For localization, the proposed model adapts the deep YOLOv4 detector by incorporating spatial and contextual information through spatial attention and channel attention blocks, respectively. The detected polyps are then classified as adenoma or non-adenoma using a fusion of deep and handcrafted features. Because polyp shape and texture are essential for discriminating polyp types, the proposed work combines a pyramid histogram of oriented gradients (PHOG) with embedding features learned via a triplet Siamese architecture. The PHOG extracts local shape information from each polyp class, whereas the Siamese network extracts intra-polyp discriminating features. Individual and cross-database results on two databases suggest that the method is robust in polyp localization. A comparison with current state-of-the-art methods on clinically significant parameters supports the use of the method for automated polyp localization in both real-time and offline colonoscopic video frames. The method achieves an average precision of 0.8971 and 0.9171 and an F1 score of 0.8869 and 0.8812 on the Kvasir-SEG and SUN databases, respectively. The classification framework yields an accuracy of 96.66% and an F1 score of 96.54% on a publicly available UCI colonoscopy video dataset, supporting the potential of the proposed framework for polyp localization and classification.
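The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the three classification ingredients it names: a PHOG descriptor, a triplet loss for Siamese embeddings, and feature fusion. Everything here is an assumption made for illustration: the function names `phog`, `triplet_loss`, and `fuse`, the pyramid depth, the bin count, the margin, and the choice of concatenation as the fusion step. The PHOG variant below also histograms all gradients weighted by magnitude, a simplification of the usual edge-based formulation, and should not be read as the authors' method.

```python
import numpy as np

def phog(gray, levels=2, bins=8):
    """Simplified PHOG: orientation histograms over a spatial pyramid.

    Assumption: gradients of every pixel are used, weighted by magnitude;
    the paper's exact edge extraction and parameters are not stated.
    """
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    # Unsigned orientation folded into [0, pi).
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    bin_idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    h, w = gray.shape
    feats = []
    for level in range(levels + 1):
        cells = 2 ** level  # cells per side at this pyramid level
        ys = np.linspace(0, h, cells + 1).astype(int)
        xs = np.linspace(0, w, cells + 1).astype(int)
        for i in range(cells):
            for j in range(cells):
                b = bin_idx[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].ravel()
                m = mag[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].ravel()
                feats.append(np.bincount(b, weights=m, minlength=bins))
    v = np.concatenate(feats)
    total = v.sum()
    return v / total if total > 0 else v

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss on squared Euclidean distances (batch mean)."""
    d_ap = np.sum((anchor - positive) ** 2, axis=-1)
    d_an = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(d_ap - d_an + margin, 0.0).mean()

def fuse(phog_vec, embedding):
    """Hypothetical fusion: L2-normalize each part, then concatenate."""
    p = phog_vec / (np.linalg.norm(phog_vec) + 1e-12)
    e = embedding / (np.linalg.norm(embedding) + 1e-12)
    return np.concatenate([p, e])
```

With `levels=2` and `bins=8`, `phog` returns a vector of length 8 × (1 + 4 + 16) = 168. The fused vector would then be passed to a classifier of one's choice to separate adenoma from non-adenoma crops.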


Source journal

Medical & Biological Engineering & Computing (Medicine, Engineering: Biomedical)

CiteScore: 6.00

Self-citation rate: 3.10%

Articles published: 249

Review time: 3.5 months

Journal description: Founded in 1963, Medical & Biological Engineering & Computing (MBEC) continues to serve the biomedical engineering community, covering the entire spectrum of biomedical and clinical engineering. The journal presents exciting and vital experimental and theoretical developments in biomedical science and technology, and reports on advances in computer-based methodologies in these multidisciplinary subjects. The journal also incorporates new and evolving technologies including cellular engineering and molecular imaging. MBEC publishes original research articles as well as reviews and technical notes. Its Rapid Communications category focuses on material of immediate value to the readership, while the Controversies section provides a forum to exchange views on selected issues, stimulating a vigorous and informed debate in this exciting and high-profile field. MBEC is an official journal of the International Federation of Medical and Biological Engineering (IFMBE).