Artificial intelligence for the diagnosis of Helicobacter pylori infection in endoscopic and pathological tissues images: A systematic review and meta-analysis
Yuting Wen, Yao Huang, Yu Liu, Shasha Zhang, Zhe Liu, Chan Hui, Yi Wang
Intelligence-based medicine, vol. 11, Article 100244. Published 2025-01-01. DOI: 10.1016/j.ibmed.2025.100244. https://www.sciencedirect.com/science/article/pii/S2666521225000481
Citations: 0
Abstract
Background
In recent years, artificial intelligence (AI) algorithms, including deep learning, have shown remarkable progress in image-recognition tasks. This study aimed to evaluate the diagnostic performance of AI in diagnosing Helicobacter pylori (H. pylori) infection using endoscopic and pathological images.
Methods
A literature search was conducted across multiple databases to identify all primary studies related to the diagnostic performance of AI algorithms for H. pylori infection published before 2024. True positive (TP), false positive (FP), false negative (FN), and true negative (TN) values were extracted or calculated for each study. Pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), precision-recall (PR), and diagnostic odds ratio (DOR) were calculated. A summary receiver operating characteristic curve (SROC) was used to assess overall diagnostic performance.
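As an illustration of how the per-study measures follow from a 2×2 table, here is a minimal Python sketch. It is not the authors' code: the names `ConfusionCounts` and `diagnostic_metrics` and the example counts are hypothetical, and the pooled estimates in a meta-analysis come from a statistical model fitted across studies rather than from this per-study arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table for one study (hypothetical container, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Per-study accuracy measures using the standard definitions."""
    if min(c.tp, c.fp, c.fn, c.tn) <= 0:
        # Assumption: tables with empty cells are rejected here; in practice
        # a continuity correction (e.g. adding 0.5 to each cell) is common.
        raise ValueError("all four cells must be positive")
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": sensitivity / (1 - specificity),  # positive likelihood ratio
        "nlr": (1 - sensitivity) / specificity,  # negative likelihood ratio
        "dor": (c.tp * c.tn) / (c.fp * c.fn),    # diagnostic odds ratio
    }


# Made-up counts, for illustration only.
example = diagnostic_metrics(ConfusionCounts(tp=90, fp=20, fn=10, tn=80))
```

With these made-up counts the sensitivity is 0.90, the specificity is 0.80, and the DOR is 36.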
Results
Twelve studies were included in the final analysis. The pooled sensitivity was 0.87 (95% CI 0.78–0.92), pooled specificity was 0.79 (95% CI 0.54–0.92), pooled PLR was 4.1 (95% CI 1.7–9.8), and pooled NLR was 0.17 (95% CI 0.10–0.29). The DOR was 24 (95% CI 7–78), and the area under the SROC curve was 0.90 (95% CI 0.87–0.92). Substantial heterogeneity was observed among the studies (sensitivity: I² = 90.50%, 95% CI 86.37–94.62; specificity: I² = 98.66%, 95% CI 98.34–98.97). Deeks' funnel plot showed no significant evidence of publication bias (P = 0.89).
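As a rough consistency check, the reported likelihood ratios and DOR can be compared with the values implied by the pooled sensitivity and specificity. This is only a sketch using the standard definitions. The variable names are illustrative, and small differences from the reported figures are expected because the pooled ratios are estimated by the meta-analytic model rather than derived from the two point estimates.

```python
# Pooled point estimates as reported in the abstract.
pooled_sensitivity = 0.87
pooled_specificity = 0.79

# Ratios implied by the standard definitions.
implied_plr = pooled_sensitivity / (1 - pooled_specificity)
implied_nlr = (1 - pooled_sensitivity) / pooled_specificity
implied_dor = implied_plr / implied_nlr

print(f"implied PLR = {implied_plr:.2f} (reported 4.1)")
print(f"implied NLR = {implied_nlr:.3f} (reported 0.17)")
print(f"implied DOR = {implied_dor:.1f} (reported 24)")
```

The implied values are about 4.14, 0.165, and 25.2, close to the reported pooled PLR, NLR, and DOR.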
Conclusions
AI algorithms show potential for diagnosing H. pylori infection by improving accuracy and lesion detection. However, because of the substantial heterogeneity among studies, more comprehensive clinical validation is needed before widespread application. Future research should focus on multicenter validation, standardized datasets, integration into clinical workflows, and data privacy and ethics to promote broader use of AI in H. pylori diagnosis.