Pradipta Sasmal, Susant Kumar Panigrahi, Swarna Laxmi Panda, M K Bhuyan
{"title":"Attention-guided deep framework for polyp localization and subsequent classification via polyp local and Siamese feature fusion.","authors":"Pradipta Sasmal, Susant Kumar Panigrahi, Swarna Laxmi Panda, M K Bhuyan","doi":"10.1007/s11517-025-03369-z","DOIUrl":null,"url":null,"abstract":"<p><p>Colorectal cancer (CRC) is one of the leading causes of death worldwide. This paper proposes an automated diagnostic technique to detect, localize, and classify polyps in colonoscopy video frames. The proposed model adopts the deep YOLOv4 model that incorporates both spatial and contextual information in the form of spatial attention and channel attention blocks, respectively for better localization of polyps. Finally, leveraging a fusion of deep and handcrafted features, the detected polyps are classified as adenoma or non-adenoma. Polyp shape and texture are essential features in discriminating polyp types. Therefore, the proposed work utilizes a pyramid histogram of oriented gradient (PHOG) and embedding features learned via triplet Siamese architecture to extract these features. The PHOG extracts local shape information from each polyp class, whereas the Siamese network extracts intra-polyp discriminating features. The individual and cross-database performances on two databases suggest the robustness of our method in polyp localization. The competitive analysis based on significant clinical parameters with current state-of-the-art methods confirms that our method can be used for automated polyp localization in both real-time and offline colonoscopic video frames. Our method provides an average precision of 0.8971 and 0.9171 and an F1 score of 0.8869 and 0.8812 for the Kvasir-SEG and SUN databases. Similarly, the proposed classification framework for the detected polyps yields a classification accuracy of 96.66% on a publicly available UCI colonoscopy video dataset. Moreover, the classification framework provides an F1 score of 96.54% that validates the potential of the proposed framework in polyp localization and classification.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical & Biological Engineering & Computing","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1007/s11517-025-03369-z","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0
Abstract
Colorectal cancer (CRC) is one of the leading causes of death worldwide. This paper proposes an automated diagnostic technique to detect, localize, and classify polyps in colonoscopy video frames. The proposed model adopts the deep YOLOv4 model that incorporates both spatial and contextual information in the form of spatial attention and channel attention blocks, respectively for better localization of polyps. Finally, leveraging a fusion of deep and handcrafted features, the detected polyps are classified as adenoma or non-adenoma. Polyp shape and texture are essential features in discriminating polyp types. Therefore, the proposed work utilizes a pyramid histogram of oriented gradient (PHOG) and embedding features learned via triplet Siamese architecture to extract these features. The PHOG extracts local shape information from each polyp class, whereas the Siamese network extracts intra-polyp discriminating features. The individual and cross-database performances on two databases suggest the robustness of our method in polyp localization. The competitive analysis based on significant clinical parameters with current state-of-the-art methods confirms that our method can be used for automated polyp localization in both real-time and offline colonoscopic video frames. Our method provides an average precision of 0.8971 and 0.9171 and an F1 score of 0.8869 and 0.8812 for the Kvasir-SEG and SUN databases. Similarly, the proposed classification framework for the detected polyps yields a classification accuracy of 96.66% on a publicly available UCI colonoscopy video dataset. Moreover, the classification framework provides an F1 score of 96.54% that validates the potential of the proposed framework in polyp localization and classification.
期刊介绍:
Founded in 1963, Medical & Biological Engineering & Computing (MBEC) continues to serve the biomedical engineering community, covering the entire spectrum of biomedical and clinical engineering. The journal presents exciting and vital experimental and theoretical developments in biomedical science and technology, and reports on advances in computer-based methodologies in these multidisciplinary subjects. The journal also incorporates new and evolving technologies including cellular engineering and molecular imaging.
MBEC publishes original research articles as well as reviews and technical notes. Its Rapid Communications category focuses on material of immediate value to the readership, while the Controversies section provides a forum to exchange views on selected issues, stimulating a vigorous and informed debate in this exciting and high profile field.
MBEC is an official journal of the International Federation of Medical and Biological Engineering (IFMBE).