Harriet Evans, Naveen Sivakumar, Shivam Bhanderi, Simon Graham, David Snead, Abhilasha Patel, Andrew Robinson
Evaluating the pathological and clinical implications of errors made by an artificial intelligence colon biopsy screening tool.
BMJ Open Gastroenterology, 12(1). Published 2025-01-06. DOI: 10.1136/bmjgast-2024-001649 (https://doi.org/10.1136/bmjgast-2024-001649). Publication type: Journal Article. Impact factor: 3.3; JCR Q2 (Gastroenterology & Hepatology). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11749196/pdf/
Citation count: 0
Abstract
Objective: Artificial intelligence (AI) tools for histological diagnosis offer great potential to healthcare, yet failure to understand their clinical context is delaying adoption. IGUANA (Interpretable Gland-Graphs using a Neural Aggregator) is an AI algorithm that can effectively classify colonic biopsies into normal versus abnormal categories, designed to automatically report normal cases. We performed a retrospective pathological and clinical review of the errors made by IGUANA.
Methods: False negative (FN) errors were the primary focus due to the greatest propensity for harm. Pathological evaluation involved assessment of whole slide image (WSI) quality, precise diagnoses for each missed entity and identification of factors impeding diagnosis. Clinical evaluation scored the impact of each error on the patient and detailed the type of impact in terms of missed diagnosis, investigations or treatment.
Results: Across 5054 WSIs from 2080 UK National Health Service patients, there were 220 FN errors across 164 cases (4.4% of WSIs, 7.9% of cases). Diagnostic errors ranged from adenocarcinoma to mild inflammation. 88.4% of FN errors would have no impact on patient care, with only one error causing major patient harm. Factors that protected against harm included the biopsy being a low-risk polyp and diagnostic features being detected in other biopsies.
Conclusion: Most FN errors would not result in patient harm, suggesting that even with a 7.9% case-level error rate, this AI tool might be more suitable for adoption than statistics portray. Consideration of the clinical context of AI tool errors is essential to facilitate safe implementation.
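The slide-level and case-level false negative rates quoted in the Results can be reproduced directly from the reported counts. The short sketch below does only that arithmetic. The function name `rate_percent` is illustrative and does not come from the paper, and the calculation assumes the percentages are simple proportions rounded to one decimal place.

```python
def rate_percent(errors: int, total: int) -> float:
    """Return errors / total as a percentage rounded to one decimal place."""
    return round(100 * errors / total, 1)


# Counts as reported in the abstract.
FN_SLIDES, TOTAL_SLIDES = 220, 5054  # false negative WSIs / all WSIs
FN_CASES, TOTAL_CASES = 164, 2080    # cases with an FN error / all patients

wsi_rate = rate_percent(FN_SLIDES, TOTAL_SLIDES)
case_rate = rate_percent(FN_CASES, TOTAL_CASES)

print(f"Slide-level FN rate: {wsi_rate}%")  # 4.4%
print(f"Case-level FN rate: {case_rate}%")  # 7.9%
```

The case-level rate is higher than the slide-level rate because a single patient case can contribute several slides, and one missed slide is enough to count the whole case as an error.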
About the journal:
BMJ Open Gastroenterology is an online-only, peer-reviewed, open access gastroenterology journal, dedicated to publishing high-quality medical research from all disciplines and therapeutic areas of gastroenterology. It is the open access companion journal of Gut and is co-owned by the British Society of Gastroenterology. The journal publishes all research study types, from study protocols to phase I trials to meta-analyses, including small or specialist studies. Publishing procedures are built around continuous publication, publishing research online as soon as the article is ready.