PolyNet: A self-attention based CNN model for classifying the colon polyp from colonoscopy image

Khaled Eabne Delowar, Mohammed Borhan Uddin, Md Khaliluzzaman, Riadul Islam Rabbi, Md Jakir Hossen, M. Moazzam Hossen

Informatics in Medicine Unlocked, Volume 56, Article 101654 (2025). DOI: 10.1016/j.imu.2025.101654
Full text: https://www.sciencedirect.com/science/article/pii/S2352914825000425
Citations: 0
Abstract
Colon polyps are small, precancerous growths in the colon that can indicate colorectal cancer (CRC), a disease with a significant public-health impact. Colonoscopy is the medical procedure used to detect colon polyps, but manual examination to identify the type of polyp is time-consuming, tedious, and prone to human error. Automatic classification of polyps from colonoscopy images can be more efficient. Although no specialized methods currently exist for classifying polyps from colonoscopy images, several state-of-the-art CNN models can perform this task. We introduce a new CNN-based model, PolyNet, which achieves the best polyp-classification accuracy among the models evaluated, outperforming the pre-trained models VGG16, ResNet50, DenseNetV3, MobileNetV3, and InceptionV3 as well as nine other customized CNN-based models. This study also provides a sensitivity analysis demonstrating how slight modifications to the network's architecture affect the balance between accuracy and performance. We examined different CNN architectures and developed an effective convolutional neural network (CNN) model for correctly predicting colon polyps using the Kvasir dataset. A self-attention mechanism is incorporated into the best CNN model, i.e., PolypNet, to improve accuracy. For comparison, DenseNetV3, MobileNetV3, InceptionV3, VGG16, and ResNet50 achieve 73.87 %, 69.38 %, 61.12 %, 84.00 %, and 86.12 % accuracy on the Kvasir dataset, while PolypNet with attention achieves 86 % accuracy, 86 % precision, 85 % recall, and an 86 % F1-score.
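The abstract's key architectural idea is adding a self-attention layer on top of CNN feature maps. As a minimal illustrative sketch (the abstract does not specify PolypNet's layer shapes or projection dimensions, so all sizes and the `self_attention_2d` helper below are assumptions), scaled dot-product self-attention over a flattened feature map can be written as:

```python
import numpy as np

def self_attention_2d(feat, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a CNN feature map.

    feat: (H, W, C) feature map; w_q, w_k, w_v: (C, d) projection matrices.
    Each spatial position is treated as one token; returns an attended
    feature map of shape (H, W, d).
    """
    h, w, c = feat.shape
    x = feat.reshape(h * w, c)                 # flatten spatial grid into HW tokens
    q, k, v = x @ w_q, x @ w_k, x @ w_v        # query / key / value projections
    scores = q @ k.T / np.sqrt(w_q.shape[1])   # (HW, HW) scaled similarities
    scores -= scores.max(axis=1, keepdims=True)  # subtract row max for stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)    # row-wise softmax: rows sum to 1
    return (attn @ v).reshape(h, w, -1)        # weighted mix of all positions

# Toy usage: an 8x8 feature map with 16 channels and 16-dim projections.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 16))
w = rng.standard_normal((16, 16)) * 0.1
out = self_attention_2d(feat, w, w, w)
print(out.shape)  # (8, 8, 16)
```

Because every output position is a softmax-weighted mix of all spatial positions, such a layer lets the classifier relate a polyp region to distant context in the frame, which is the plausible motivation for adding attention to the best CNN here.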
About the journal:
Informatics in Medicine Unlocked (IMU) is an international gold open access journal covering a broad spectrum of topics within medical informatics, including (but not limited to) papers focusing on imaging, pathology, teledermatology, public health, ophthalmology, nursing, and translational medicine informatics. All papers published in the journal are freely accessible on its website.