{"title":"基于供应商不确定视觉转换器的人工智能用于经口胆道镜检查:与卷积神经网络和内窥镜医师比较胆道狭窄的诊断性能。","authors":"Ryosuke Sato, Kazuyuki Matsumoto, Masahiro Tomiya, Takayoshi Tanimoto, Akimitsu Ohto, Kentaro Oki, Satoshi Kajitani, Tatsuya Kikuchi, Akihiro Matsumi, Kazuya Miyamoto, Yuki Fujii, Daisuke Uchida, Koichiro Tsutsumi, Shigeru Horiguchi, Yoshiro Kawahara, Motoyuki Otsuka","doi":"10.1111/den.70028","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>Accurate diagnosis of biliary strictures remains challenging. This study aimed to develop an artificial intelligence (AI) system for peroral cholangioscopy (POCS) using a Vision Transformer (ViT) architecture and to evaluate its performance compared to different vendor devices, conventional convolutional neural networks (CNNs), and endoscopists.</p><p><strong>Methods: </strong>We retrospectively analyzed 125 patients with indeterminate biliary strictures who underwent POCS between 2012 and 2024. AI models including the ViT architecture and two established CNN architectures were developed using images from CHF-B260 or B290 (CHF group; Olympus Medical) and SpyScope DS or DS II (Spy group; Boston Scientific) systems via a patient-level, 3-fold cross-validation. For a direct comparison against endoscopists, a balanced 440-image test set, containing an equal number of images from each vendor, was used for a blinded evaluation.</p><p><strong>Results: </strong>The 3-fold cross-validation on the entire 2062-image dataset yielded a robust accuracy of 83.9% (95% confidence interval (CI), 80.9-86.7) for the ViT model. The model's accuracy was consistent between CHF (82.7%) and Spy (86.8%, p = 0.198) groups, and its performance was comparable to the evaluated conventional CNNs. On the 440-image test set, the ViT's accuracy of 78.4% (95% CI, 72.5-83.8) was comparable to that of expert endoscopists (82.0%, p = 0.148) and non-experts (73.0%, p = 0.066), with no statistically significant differences observed.</p><p><strong>Conclusions: </strong>The novel ViT-based AI model demonstrated high vendor-agnostic diagnostic accuracy across multiple POCS systems, achieving performance comparable to conventional CNNs and endoscopists evaluated in this study.</p>","PeriodicalId":72813,"journal":{"name":"Digestive endoscopy : official journal of the Japan Gastroenterological Endoscopy Society","volume":" ","pages":""},"PeriodicalIF":4.7000,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Vendor-Agnostic Vision Transformer-Based Artificial Intelligence for Peroral Cholangioscopy: Diagnostic Performance in Biliary Strictures Compared With Convolutional Neural Networks and Endoscopists.\",\"authors\":\"Ryosuke Sato, Kazuyuki Matsumoto, Masahiro Tomiya, Takayoshi Tanimoto, Akimitsu Ohto, Kentaro Oki, Satoshi Kajitani, Tatsuya Kikuchi, Akihiro Matsumi, Kazuya Miyamoto, Yuki Fujii, Daisuke Uchida, Koichiro Tsutsumi, Shigeru Horiguchi, Yoshiro Kawahara, Motoyuki Otsuka\",\"doi\":\"10.1111/den.70028\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objectives: </strong>Accurate diagnosis of biliary strictures remains challenging. 
This study aimed to develop an artificial intelligence (AI) system for peroral cholangioscopy (POCS) using a Vision Transformer (ViT) architecture and to evaluate its performance compared to different vendor devices, conventional convolutional neural networks (CNNs), and endoscopists.</p><p><strong>Methods: </strong>We retrospectively analyzed 125 patients with indeterminate biliary strictures who underwent POCS between 2012 and 2024. AI models including the ViT architecture and two established CNN architectures were developed using images from CHF-B260 or B290 (CHF group; Olympus Medical) and SpyScope DS or DS II (Spy group; Boston Scientific) systems via a patient-level, 3-fold cross-validation. For a direct comparison against endoscopists, a balanced 440-image test set, containing an equal number of images from each vendor, was used for a blinded evaluation.</p><p><strong>Results: </strong>The 3-fold cross-validation on the entire 2062-image dataset yielded a robust accuracy of 83.9% (95% confidence interval (CI), 80.9-86.7) for the ViT model. The model's accuracy was consistent between CHF (82.7%) and Spy (86.8%, p = 0.198) groups, and its performance was comparable to the evaluated conventional CNNs. On the 440-image test set, the ViT's accuracy of 78.4% (95% CI, 72.5-83.8) was comparable to that of expert endoscopists (82.0%, p = 0.148) and non-experts (73.0%, p = 0.066), with no statistically significant differences observed.</p><p><strong>Conclusions: </strong>The novel ViT-based AI model demonstrated high vendor-agnostic diagnostic accuracy across multiple POCS systems, achieving performance comparable to conventional CNNs and endoscopists evaluated in this study.</p>\",\"PeriodicalId\":72813,\"journal\":{\"name\":\"Digestive endoscopy : official journal of the Japan Gastroenterological Endoscopy Society\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":4.7000,\"publicationDate\":\"2025-09-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Digestive endoscopy : official journal of the Japan Gastroenterological Endoscopy Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1111/den.70028\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Digestive endoscopy : official journal of the Japan Gastroenterological Endoscopy Society","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1111/den.70028","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Vendor-Agnostic Vision Transformer-Based Artificial Intelligence for Peroral Cholangioscopy: Diagnostic Performance in Biliary Strictures Compared With Convolutional Neural Networks and Endoscopists.
Objectives: Accurate diagnosis of biliary strictures remains challenging. This study aimed to develop an artificial intelligence (AI) system for peroral cholangioscopy (POCS) based on a Vision Transformer (ViT) architecture and to evaluate its performance across devices from different vendors and against conventional convolutional neural networks (CNNs) and endoscopists.
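As a point of reference for the architecture named above, the following is a minimal sketch (not the authors' implementation) of adapting an ImageNet-pretrained Vision Transformer to a two-class stricture classification task; the torchvision backbone, input size, and head replacement are illustrative assumptions.

import torch
import torch.nn as nn
from torchvision import models

def build_vit_classifier(num_classes: int = 2) -> nn.Module:
    # Start from an ImageNet-pretrained ViT-B/16 backbone (illustrative choice).
    model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
    # Swap the classification head for a two-class linear layer
    # (e.g., benign vs. malignant stricture).
    in_features = model.heads.head.in_features
    model.heads.head = nn.Linear(in_features, num_classes)
    return model

model = build_vit_classifier()
logits = model(torch.randn(1, 3, 224, 224))  # ViT-B/16 expects 224x224 RGB input; output shape (1, 2)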
Methods: We retrospectively analyzed 125 patients with indeterminate biliary strictures who underwent POCS between 2012 and 2024. AI models based on the ViT architecture and two established CNN architectures were developed with patient-level 3-fold cross-validation, using images from the CHF-B260 or B290 (CHF group; Olympus Medical) and SpyScope DS or DS II (Spy group; Boston Scientific) systems. For a direct comparison against endoscopists, a balanced 440-image test set containing an equal number of images from each vendor was used for a blinded evaluation.
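A minimal sketch of the patient-level 3-fold split described above, assuming scikit-learn's GroupKFold so that all images from one patient fall into the same fold; the toy arrays and variable names are illustrative, not the study's data pipeline.

import numpy as np
from sklearn.model_selection import GroupKFold

# One entry per cholangioscopy image; patient_ids records which patient each image came from.
image_ids   = np.array([f"img_{i:03d}" for i in range(12)])
labels      = np.array([0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1])
patient_ids = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5])

gkf = GroupKFold(n_splits=3)
for fold, (train_idx, test_idx) in enumerate(gkf.split(image_ids, labels, groups=patient_ids)):
    # Patient-level split: no patient contributes images to both partitions of a fold.
    assert set(patient_ids[train_idx]).isdisjoint(patient_ids[test_idx])
    print(f"fold {fold}: {len(train_idx)} train images, {len(test_idx)} test images")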
Results: The 3-fold cross-validation on the entire 2062-image dataset yielded a robust accuracy of 83.9% (95% confidence interval [CI], 80.9-86.7) for the ViT model. The model's accuracy was consistent between the CHF (82.7%) and Spy (86.8%, p = 0.198) groups, and its performance was comparable to that of the evaluated conventional CNNs. On the 440-image test set, the ViT's accuracy of 78.4% (95% CI, 72.5-83.8) was comparable to that of expert endoscopists (82.0%, p = 0.148) and non-experts (73.0%, p = 0.066), with no statistically significant differences observed.
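The abstract does not state how the 95% CIs for accuracy were derived; shown below is a hedged sketch of one common approach, a nonparametric percentile bootstrap over image-level predictions, applied to toy data rather than the study's results.

import numpy as np

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=440)                          # toy ground truth for a 440-image test set
y_pred = np.where(rng.random(440) < 0.80, y_true, 1 - y_true)  # predictions correct ~80% of the time

correct = (y_true == y_pred).astype(float)
accuracy = correct.mean()

# Percentile bootstrap: resample the per-image correctness indicators with replacement.
boot = [rng.choice(correct, size=correct.size, replace=True).mean() for _ in range(2000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"accuracy {accuracy:.1%} (95% CI {lo:.1%}-{hi:.1%})")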
Conclusions: The novel ViT-based AI model demonstrated high vendor-agnostic diagnostic accuracy across multiple POCS systems, achieving performance comparable to conventional CNNs and endoscopists evaluated in this study.