Weighted ensemble deep learning approach for classification of gastrointestinal diseases in colonoscopy images aided by explainable AI

IF 3.7 · Tier 2, Engineering & Technology · JCR Q1, Computer Science, Hardware & Architecture

Displays · Pub Date: 2024-11-06 · DOI: 10.1016/j.displa.2024.102874

Faruk Enes Oğuz, Ahmet Alkan

Displays, Volume 85, Article 102874. Full text: https://www.sciencedirect.com/science/article/pii/S0141938224002385

Citations: 0

Abstract

Gastrointestinal diseases are significant health issues worldwide, requiring early diagnosis due to their serious health implications. Therefore, detecting these diseases using artificial intelligence-based medical decision support systems through colonoscopy images plays a critical role in early diagnosis. In this study, a deep learning-based method is proposed for the classification of gastrointestinal diseases and colon anatomical landmarks using colonoscopy images. For this purpose, five different Convolutional Neural Network (CNN) models, namely Xception, ResNet-101, NASNet-Large, EfficientNet, and NASNet-Mobile, were trained. An ensemble model was created using class-based recall values derived from the validation performances of the top three models (Xception, ResNet-101, NASNet-Large). A user-friendly Graphical User Interface (GUI) was developed, allowing users to perform classification tasks and use Gradient-weighted Class Activation Mapping (Grad-CAM), an explainable AI tool, to visualize the regions from which the model derives information. Grad-CAM visualizations contribute to a better understanding of the model’s decision-making processes and play an important role in the application of explainable AI. In the study, eight labels, including anatomical markers such as z-line, pylorus, and cecum, as well as pathological findings like esophagitis, polyps, and ulcerative colitis, were classified using the KVASIR V2 dataset. The proposed ensemble model achieved a 94.125% accuracy on the KVASIR V2 dataset, demonstrating competitive performance compared to similar studies in the literature. Additionally, the precision and F1 score values of this model are equal to 94.168% and 94.125%, respectively. These results suggest that the proposed method provides an effective solution for the diagnosis of GI diseases and can be beneficial for medical education.
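The abstract says the ensemble is built from class-based recall values measured on the validation set, but it does not give the exact weighting formula. The following is a minimal sketch under one plausible assumption: for each class, every model's validation recall is normalized across the models to give a per-class weight, and the weighted softmax scores are summed before taking the argmax. The function names (`per_class_recall`, `recall_weights`, `ensemble_predict`) and the normalization scheme are illustrative, not taken from the paper.

```python
from typing import Dict, List, Sequence


def per_class_recall(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[float]:
    """Recall per class: correct predictions / true samples of that class."""
    hits = [0] * n_classes
    totals = [0] * n_classes
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return [h / n if n else 0.0 for h, n in zip(hits, totals)]


def recall_weights(recalls: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Assumed scheme: per class, normalize the models' recalls to sum to 1."""
    names = list(recalls)
    n_classes = len(recalls[names[0]])
    weights = {name: [0.0] * n_classes for name in names}
    for c in range(n_classes):
        total = sum(recalls[name][c] for name in names)
        for name in names:
            # Fall back to equal weights if no model recalls this class at all.
            weights[name][c] = (recalls[name][c] / total if total
                                else 1.0 / len(names))
    return weights


def ensemble_predict(probs: Dict[str, List[float]],
                     weights: Dict[str, List[float]]) -> int:
    """Weighted soft vote over one image's class probabilities."""
    n_classes = len(next(iter(weights.values())))
    scores = [sum(weights[m][c] * probs[m][c] for m in probs)
              for c in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)


# Toy two-class, two-model example (all numbers are made up).
val_recalls = {"model_a": [1.0, 0.5], "model_b": [0.5, 1.0]}
w = recall_weights(val_recalls)
pred = ensemble_predict({"model_a": [0.6, 0.4], "model_b": [0.3, 0.7]}, w)
```

In the study's setting there would be three models (Xception, ResNet-101, NASNet-Large) and eight classes. Weighting per class rather than per model lets a network that is strong on, say, polyps dominate that class while contributing less where its recall is lower.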


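Source journal

Displays (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 4.60

Self-citation rate: 25.60%

Articles published: 138

Review time: 92 days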

About the journal: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface. Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.