Machine Learning Achieves Pathologist-Level Coeliac Disease Diagnosis.

NEJM AI. Pub Date: 2025-03-27. DOI: 10.1056/AIoa2400738
F Jaeckle, J Denholm, B Schreiber, S C Evans, M N Wicks, J Y H Chan, A C Bateman, S Natu, M J Arends, E Soilleux
NEJM AI, volume 2, issue 4, article aioa2400738. Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7617718/pdf/

Abstract

Background: The diagnosis of coeliac disease (CD), an autoimmune disorder with an estimated global prevalence of around 1%, generally relies on the histological examination of duodenal biopsies. However, inter-pathologist agreement for coeliac disease diagnosis is estimated to be no more than 80%. We aim to improve coeliac disease diagnosis by developing a novel, accurate, machine-learning-based diagnostic classifier.

Methods: We present a machine learning model that diagnoses the presence or absence of coeliac disease from a set of duodenal biopsies representative of real-world clinical data. Our model was trained on a diverse dataset of 3,383 whole-slide images (WSIs) of H&E-stained duodenal biopsies, along with their clinical diagnoses, from four hospitals using five different WSI scanners. We trained our model using the multiple-instance-learning paradigm in a weakly supervised manner with cross-validation and evaluated it on an independent test set of 644 unseen scans from a different regional NHS Trust. Additionally, we compared the model's predictions to independent diagnoses from four specialist pathologists on a subset of the test data.
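To illustrate what weakly supervised multiple-instance learning means in practice, here is a minimal sketch in which a slide is treated as a "bag" of patch feature vectors and only the slide-level label is available. The abstract does not describe the architecture, so the attention-style pooling, the function name `mil_predict`, and all weights and feature values below are assumptions chosen for illustration, not the authors' method.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mil_predict(bag, attn_w, clf_w, clf_b):
    """Slide-level prediction from a bag of patch feature vectors.

    bag    : list of equal-length feature vectors, one per patch (hypothetical)
    attn_w : attention weights scoring how informative each patch is (hypothetical)
    clf_w, clf_b : linear classifier applied to the pooled slide vector (hypothetical)
    """
    # One attention weight per patch; the weights sum to 1 across the bag.
    weights = softmax([dot(attn_w, patch) for patch in bag])
    dim = len(bag[0])
    # Attention-weighted average of patch features gives one slide vector.
    slide_vec = [sum(w * patch[i] for w, patch in zip(weights, bag))
                 for i in range(dim)]
    # Logistic output: probability that the slide is labelled positive.
    logit = dot(clf_w, slide_vec) + clf_b
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, weights

# Toy bag of three 2-dimensional patch features (made-up numbers).
toy_bag = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]]
prob, weights = mil_predict(toy_bag, attn_w=[2.0, 2.0], clf_w=[3.0, 3.0], clf_b=-2.0)
```

In a trained model the attention and classifier weights would be learned from slide-level diagnoses alone, which is why no patch-level annotation is needed.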

Results: Our model diagnosed coeliac disease in an independent test set from a previously unseen source with accuracy, sensitivity, and specificity exceeding 95% and an area under the ROC curve exceeding 99%. These results indicate that the model has the potential to outperform pathologists. In comparing the model's predictions to diagnoses on unseen test data from four independent pathologists, we found statistically indistinguishable results between pathologist-pathologist and pathologist-model inter-observer agreement (p > 96%).
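For readers less familiar with the metrics quoted above, the following sketch shows how accuracy, sensitivity, specificity, and the area under the ROC curve are computed for a binary classifier. The labels and scores are made-up toy values, not data from the study, and the 0.5 decision threshold and the function names are assumptions for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = disease)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # fraction of true cases detected
        "specificity": tn / (tn + fp),   # fraction of non-cases correctly cleared
    }

def roc_auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical values, not the study's results).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.1, 0.3, 0.6, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Note that accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the AUC summarises ranking quality across all thresholds.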

Conclusions: Our model achieved pathologist-level performance in diagnosing the presence or absence of coeliac disease from a representative set of duodenal biopsies, representing a significant advancement towards the adoption of machine learning in clinical practice. Additionally, it demonstrated strong generalisability, performing equally well on biopsies from a previously unseen hospital. We concluded that our model has the potential to revolutionise duodenal biopsy diagnosis by accurately identifying or ruling out coeliac disease, thereby significantly reducing the time required for pathologists to make a diagnosis.
