Machine Learning Achieves Pathologist-Level Coeliac Disease Diagnosis
F Jaeckle, J Denholm, B Schreiber, S C Evans, M N Wicks, J Y H Chan, A C Bateman, S Natu, M J Arends, E Soilleux
NEJM AI 2(4), aioa2400738. Journal Article. Published 2025-03-27.
DOI: https://doi.org/10.1056/AIoa2400738
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7617718/pdf/
Abstract
Background: The diagnosis of coeliac disease (CD), an autoimmune disorder with an estimated global prevalence of around 1%, generally relies on the histological examination of duodenal biopsies. However, inter-pathologist agreement for coeliac disease diagnosis is estimated to be no more than 80%. We aim to improve coeliac disease diagnosis by developing a novel, accurate, machine-learning-based diagnostic classifier.
Methods: We present a machine learning model that diagnoses the presence or absence of coeliac disease from a set of duodenal biopsies representative of real-world clinical data. Our model was trained on a diverse dataset of 3,383 whole-slide images (WSIs) of H&E-stained duodenal biopsies, together with their clinical diagnoses, from four hospitals using five different WSI scanners. We trained our model using the multiple-instance-learning paradigm in a weakly supervised manner with cross-validation and evaluated it on an independent test set of 644 unseen scans from a different regional NHS Trust. Additionally, we compared the model's predictions to independent diagnoses from four specialist pathologists on a subset of the test data.
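As an illustration of the multiple-instance-learning idea mentioned in the Methods, the sketch below aggregates per-patch feature vectors from one slide into a single slide-level probability, so that only a slide-level label is needed for training. It assumes attention-based pooling, which is one common MIL variant; the abstract does not state which aggregator the authors used, and the function names, weights, and feature vectors here are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mil_slide_probability(
    patch_features: List[List[float]],
    attention_weights: Sequence[float],
    classifier_weights: Sequence[float],
    classifier_bias: float = 0.0,
) -> float:
    """Pool patch features into one slide-level probability.

    Hypothetical attention-pooling head, not the paper's architecture.
    In practice the weights would be learned from slide-level labels.
    """
    # One attention score per patch, normalised so the weights sum to 1.
    attn = softmax([dot(attention_weights, f) for f in patch_features])
    # Slide embedding: attention-weighted average of the patch features.
    dim = len(patch_features[0])
    slide_embedding = [
        sum(a * f[i] for a, f in zip(attn, patch_features)) for i in range(dim)
    ]
    # Logistic output: probability that the slide is labelled positive.
    logit = dot(classifier_weights, slide_embedding) + classifier_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy slide with two patches and two-dimensional features (made-up values).
example_probability = mil_slide_probability(
    patch_features=[[1.0, 0.0], [0.0, 1.0]],
    attention_weights=[0.0, 0.0],
    classifier_weights=[2.0, 2.0],
    classifier_bias=-2.0,
)
```

The point of the design is that no patch-level annotation is required: the attention weights decide how much each patch contributes, and the loss is computed only against the slide's clinical diagnosis.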
Results: Our model diagnosed coeliac disease in an independent test set from a previously unseen source with accuracy, sensitivity, and specificity exceeding 95% and an area under the ROC curve exceeding 99%. These results indicate that the model has the potential to outperform pathologists. In comparing the model's predictions to diagnoses on unseen test data from four independent pathologists, we found statistically indistinguishable results between pathologist-pathologist and pathologist-model inter-observer agreement (p > 0.96).
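For readers unfamiliar with the metrics quoted in the Results, the following sketch shows how accuracy, sensitivity, specificity, and the area under the ROC curve can be computed from slide-level labels and model scores. The labels, scores, and the 0.5 decision threshold are made-up illustrations, not data or settings from the study, and the AUC is computed with the standard pairwise-ranking definition rather than any method the authors describe.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # fraction of positive cases detected
        "specificity": tn / (tn + fp),  # fraction of negative cases cleared
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win. This is quadratic in the number of cases,
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up example: four slides, two positive and two negative.
labels = [1, 1, 0, 0]
model_scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in model_scores]  # assumed threshold

metrics = binary_metrics(labels, predictions)
auc = roc_auc(labels, model_scores)
```

Note that accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the AUC summarises ranking quality across all thresholds.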
Conclusions: Our model achieved pathologist-level performance in diagnosing the presence or absence of coeliac disease from a representative set of duodenal biopsies, a significant advance towards the adoption of machine learning in clinical practice. Additionally, it demonstrated strong generalisability, performing equally well on biopsies from a previously unseen hospital. We conclude that our model has the potential to revolutionise duodenal biopsy diagnosis by accurately identifying or ruling out coeliac disease, thereby significantly reducing the time required for pathologists to make a diagnosis.