{"title":"Deep learning model for gastrointestinal polyp segmentation.","authors":"Zitong Wang, Zeyi Wang, Pengyu Sun","doi":"10.7717/peerj-cs.2924","DOIUrl":null,"url":null,"abstract":"<p><p>One of the biggest hazards to cancer-related mortality globally is colorectal cancer, and improved patient outcomes are greatly influenced by early identification. Colonoscopy is a highly effective screening method, yet segmentation and detection remain challenging aspects due to the heterogeneity and variability of readers' interpretations of polyps. In this work, we introduce a novel deep learning architecture for gastrointestinal polyp segmentation in the Kvasir-SEG dataset. Our method employs an encoder-decoder structure with a pre-trained ConvNeXt model as the encoder to learn multi-scale feature representations. The feature maps are passed through a ConvNeXt Block and then through a decoder network consisting of three decoder blocks. Our key contribution is the employment of a cross-attention mechanism that creates shortcut connections between the decoder and encoder to maximize feature retention and reduce information loss. In addition, we introduce a Residual Transformer Block in the decoder that learns long-term dependency by using self-attention mechanisms and enhance feature representations. We evaluate our model on the Kvasir-SEG dataset, achieving a Dice coefficient of 0.8715 and mean intersection over union (mIoU) of 0.8021. 
Our methodology demonstrates state-of-the-art performance in gastrointestinal polyp segmentation and its feasibility of being used as part of clinical pipelines to assist with automated detection and diagnosis of polyps.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"11 ","pages":"e2924"},"PeriodicalIF":3.5000,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12192692/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2924","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Colorectal cancer is among the leading causes of cancer-related mortality worldwide, and early identification greatly improves patient outcomes. Colonoscopy is a highly effective screening method, yet polyp detection and segmentation remain challenging because of the heterogeneity of polyps and the variability of readers' interpretations. In this work, we introduce a novel deep learning architecture for gastrointestinal polyp segmentation on the Kvasir-SEG dataset. Our method employs an encoder-decoder structure with a pre-trained ConvNeXt model as the encoder to learn multi-scale feature representations. The feature maps are passed through a ConvNeXt block and then through a decoder network consisting of three decoder blocks. Our key contribution is a cross-attention mechanism that creates shortcut connections between the encoder and decoder to maximize feature retention and reduce information loss. In addition, we introduce a Residual Transformer Block in the decoder that learns long-range dependencies through self-attention and enhances feature representations. We evaluate our model on the Kvasir-SEG dataset, achieving a Dice coefficient of 0.8715 and a mean intersection over union (mIoU) of 0.8021. Our methodology demonstrates state-of-the-art performance in gastrointestinal polyp segmentation and its feasibility for use in clinical pipelines to assist with automated polyp detection and diagnosis.
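The cross-attention shortcut described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: all shapes, the single-head formulation, and the random stand-in projection matrices are assumptions for illustration only. The idea shown is that decoder features (queries) attend over encoder features (keys/values) of a matching stage, and the attended encoder context is added back to the decoder path as a shortcut.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_shortcut(decoder_feats, encoder_feats, d_k=32, seed=0):
    """Single-head cross-attention shortcut (illustrative only).

    decoder_feats: (n_dec, d_model) flattened decoder tokens (queries).
    encoder_feats: (n_enc, d_model) flattened encoder tokens (keys/values).
    Projection weights are random stand-ins for learned parameters.
    """
    rng = np.random.default_rng(seed)
    d_model = decoder_feats.shape[1]
    Wq = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    Wk = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    Wv = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)

    Q = decoder_feats @ Wq                      # (n_dec, d_k)
    K = encoder_feats @ Wk                      # (n_enc, d_k)
    V = encoder_feats @ Wv                      # (n_enc, d_model)
    attn = softmax(Q @ K.T / np.sqrt(d_k))      # (n_dec, n_enc), rows sum to 1
    attended = attn @ V                         # (n_dec, d_model)
    # Shortcut connection: fuse attended encoder context into the decoder path.
    return decoder_feats + attended

dec = np.ones((16, 64))   # e.g., a 4x4 decoder map flattened, 64 channels
enc = np.ones((64, 64))   # e.g., an 8x8 encoder map flattened, 64 channels
out = cross_attention_shortcut(dec, enc)
print(out.shape)  # (16, 64)
```

Unlike the plain concatenation used in U-Net-style skip connections, this shortcut lets every decoder location weigh all encoder locations, which is how such a mechanism can retain features that a fixed spatial correspondence would lose.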
Journal introduction:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.