Evaluating the pathological and clinical implications of errors made by an artificial intelligence colon biopsy screening tool.

Impact factor 2.9, JCR Q2, GASTROENTEROLOGY & HEPATOLOGY

BMJ Open Gastroenterology. Pub Date: 2025-01-06. DOI: 10.1136/bmjgast-2024-001649

Harriet Evans, Naveen Sivakumar, Shivam Bhanderi, Simon Graham, David Snead, Abhilasha Patel, Andrew Robinson



Abstract

Objective: Artificial intelligence (AI) tools for histological diagnosis offer great potential to healthcare, yet failure to understand their clinical context is delaying adoption. IGUANA (Interpretable Gland-Graphs using a Neural Aggregator) is an AI algorithm that can effectively classify colonic biopsies into normal versus abnormal categories, designed to automatically report normal cases. We performed a retrospective pathological and clinical review of the errors made by IGUANA.

Methods: False negative (FN) errors were the primary focus because they carry the greatest propensity for harm. Pathological evaluation involved assessment of whole slide image (WSI) quality, precise diagnoses for each missed entity and identification of factors impeding diagnosis. Clinical evaluation scored the impact of each error on the patient and detailed the type of impact in terms of missed diagnosis, investigations or treatment.

Results: Across 5054 WSIs from 2080 UK National Health Service patients there were 220 FN errors across 164 cases (4.4% of WSIs, 7.9% of cases). Diagnostic errors varied from cases of adenocarcinoma to mild inflammation. 88.4% of FN errors would have no impact on patient care, with only one error causing major patient harm. Factors that protected against harm included biopsies being low-risk polyps and diagnostic features being detected in other biopsies.
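
As a quick check of the arithmetic in the Results, the sketch below recomputes the two FN percentages from the reported counts. The variable names and the `percent` helper are illustrative assumptions and do not come from the paper. Note that both denominators are the totals for all slides and all cases, so these figures are shares of the whole cohort rather than sensitivity-style rates among abnormal biopsies.

```python
# Counts as reported in the Results paragraph (names are illustrative).
fn_wsis = 220        # whole slide images with a false negative error
total_wsis = 5054    # all whole slide images reviewed
fn_cases = 164       # cases containing at least one false negative error
total_cases = 2080   # all cases (patients) in the cohort


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


wsi_rate = percent(fn_wsis, total_wsis)
case_rate = percent(fn_cases, total_cases)

print(f"WSI-level FN share:  {wsi_rate}%")
print(f"Case-level FN share: {case_rate}%")
```

Both values agree with the 4.4% and 7.9% quoted in the abstract.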

Conclusion: Most FN errors would not result in patient harm, suggesting that even with a 7.9% case-level error rate, this AI tool might be more suitable for adoption than statistics portray. Consideration of the clinical context of AI tool errors is essential to facilitate safe implementation.


Source journal

BMJ Open Gastroenterology (GASTROENTEROLOGY & HEPATOLOGY)

CiteScore: 5.90

Self-citation rate: 3.20%

Articles published: not listed

Review time: 2 weeks

About the journal: BMJ Open Gastroenterology is an online-only, peer-reviewed, open access gastroenterology journal, dedicated to publishing high-quality medical research from all disciplines and therapeutic areas of gastroenterology. It is the open access companion journal of Gut and is co-owned by the British Society of Gastroenterology. The journal publishes all research study types, from study protocols to phase I trials to meta-analyses, including small or specialist studies. Publishing procedures are built around continuous publication, publishing research online as soon as the article is ready.