From Pixels to Peristalsis: Comparing Achalasia Diagnosis with Artificial Intelligence to Expert Endoscopists
Evandros Kaklamanos, Kristjana Kristinsdottir, Matthew Wittbrodt, Meng Li, Panyavee Pitisuttithum, Wenjun Kou, Rajesh N Keswani, Dustin Carlson, Mozziyar Etemadi, John E Pandolfino
Clinical Gastroenterology and Hepatology, published 2025-10-07. DOI: 10.1016/j.cgh.2025.09.034
Citations: 0
Abstract
Background & aims: Esophageal motility disorders, such as achalasia, require multiple tests to establish a diagnosis. Patients are first assessed with upper endoscopy, but subtle disease indicators are often overlooked during the procedure, leading to delayed diagnosis. Artificial intelligence (AI) has the potential to prevent these misdiagnoses and serve as an early screening tool that enhances the physician's decision-making.
Methods: We developed a video-based transformer model capable of detecting achalasia from upper endoscopy videos. Videos were collected from 1,203 patients who presented with dysphagia between August 2018 and January 2024. Model performance was compared to a baseline of two expert physicians independently reviewing each video and then consulting one another to reach a final consensus decision. A test set of 95 patients was used to evaluate the physician consensus and model performance against ground truth labels obtained via high-resolution manometry (HRM) and the Chicago Classification v4.0 (CCv4.0) scheme.
Results: The model attained performance similar to the physician consensus on the test set, achieving an accuracy and F1 of 0.926 and 0.821, respectively, compared to the physicians (0.905, 0.710). The model also produced fewer false negatives, achieving a sensitivity and negative predictive value (NPV) of 0.800 and 0.947, respectively, compared to the physicians (0.550, 0.893). Furthermore, the model's attention mechanism emphasized clinically relevant features such as the presence of fluid or food in the esophagus and a tight lower esophageal sphincter (LES).
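To make the reported metrics concrete, the sketch below computes accuracy, F1, sensitivity, and NPV from a confusion matrix. The specific counts (TP=16, FN=4, FP=3, TN=72) are an assumption, not values reported in the abstract; they are simply one confusion matrix over 95 patients that is consistent with the model's reported figures.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int):
    """Compute the four metrics reported in the abstract from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # recall: fraction of true achalasia cases detected
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    npv = tn / (tn + fn)                  # negative predictive value
    return accuracy, f1, sensitivity, npv

# Hypothetical counts consistent with the model's reported test-set metrics (n=95).
acc, f1, sens, npv = classification_metrics(tp=16, fp=3, fn=4, tn=72)
print(round(acc, 3), round(f1, 3), round(sens, 3), round(npv, 3))
```

The NPV figure highlights why the model may be attractive as a screening tool: a high NPV means a negative read rarely hides a true achalasia case.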
Conclusion: These results indicate that AI can leverage upper endoscopy videos to detect achalasia with an accuracy comparable to that of expert endoscopists.
About the journal:
Clinical Gastroenterology and Hepatology (CGH) is dedicated to offering readers a comprehensive exploration of themes in clinical gastroenterology and hepatology. Encompassing diagnostic, endoscopic, interventional, and therapeutic advances, the journal covers areas such as cancer, inflammatory diseases, functional gastrointestinal disorders, nutrition, absorption, and secretion.
As a peer-reviewed publication, CGH features original articles and scholarly reviews, ensuring immediate relevance to the practice of gastroenterology and hepatology. Beyond peer-reviewed content, the journal includes invited key reviews and articles on endoscopy/practice-based technology, health-care policy, and practice management. Multimedia elements, including images, video abstracts, and podcasts, enhance the reader's experience. CGH remains actively engaged with its audience through updates and commentary shared via platforms such as Facebook and Twitter.