Amir Jamaludin, Rhydian Windsor, Sarim Ather, Timor Kadir, Andrew Zisserman, Juergen Braun, Lianne S Gensler, Mikkel Østergaard, Denis Poddubnyy, Thibaud Coroller, Brian Porter, Gregory Ligozio, Aimee Readie, Pedro M Machado
Automated detection of spinal bone marrow oedema in axial spondyloarthritis: training and validation using two large phase 3 trial datasets
Rheumatology. Published 2025-06-09. DOI: 10.1093/rheumatology/keaf323 (https://doi.org/10.1093/rheumatology/keaf323)
Abstract
Objective: To evaluate the performance of machine learning (ML) models for the automated scoring of spinal MRI bone marrow oedema (BMO) in patients with axial spondyloarthritis (axSpA) and to compare them with expert scoring.

Methods: ML algorithms using SpineNet software were trained and validated on 3483 spinal MRIs from 686 axSpA patients across two clinical trial datasets. The scoring pipeline involved (i) detection and labelling of vertebral bodies and (ii) classification of vertebral units for the presence or absence of BMO. Two models were tested: Model 1, without manual segmentation, and Model 2, incorporating an intermediate manual segmentation step. Model outputs were compared with those of human experts using kappa statistics, balanced accuracy, sensitivity, specificity, and AUC.

Results: Both models performed comparably to expert readers in classifying the presence vs absence of BMO. In a radiographic axSpA dataset, with absolute reader consensus scoring as the external reference, Model 1 outperformed Model 2, with an AUC of 0.94 (vs 0.88), accuracy of 75.8% (vs 70.5%), and kappa of 0.50 (vs 0.31); this performance was similar to the expert inter-reader accuracy of 76.8% and kappa of 0.47. In a non-radiographic axSpA dataset, Model 1 achieved an AUC of 0.97 (vs 0.91 for Model 2), accuracy of 74.6% (vs 70%), and kappa of 0.52 (vs 0.27), comparable to the expert inter-reader accuracy of 74.2% and kappa of 0.46.

Conclusion: ML software shows potential for automated MRI BMO assessment in axSpA, offering benefits such as improved consistency, reduced labour costs, and minimised inter- and intra-reader variability.

Trial registration: ClinicalTrials.gov, MEASURE 1 study (NCT01358175); PREVENT study (NCT02696031).
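For readers less familiar with the agreement statistics named in the Methods, the sketch below shows how sensitivity, specificity, balanced accuracy, Cohen's kappa, and AUC can be computed for binary BMO-present/absent labels. It uses the standard textbook definitions only; it is not the authors' code, the function names are hypothetical, and the labels and scores are made-up toy values rather than trial data.

```python
from typing import Dict, Sequence


def agreement_metrics(reference: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Binary agreement metrics between reference and predicted labels (1 = BMO present)."""
    tp = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 1)
    fn = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 0)
    tn = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 0)
    fp = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 1)
    n = tp + fn + tn + fp

    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    balanced_accuracy = (sensitivity + specificity) / 2

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


def auc(reference: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for r, s in zip(reference, scores) if r == 1]
    neg = [s for r, s in zip(reference, scores) if r == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example: eight vertebral units (illustrative values only, not from the study).
toy_reference = [1, 1, 1, 0, 0, 0, 0, 0]   # e.g. reader consensus
toy_predicted = [1, 1, 0, 0, 0, 0, 0, 1]   # thresholded model output
toy_scores = [0.9, 0.8, 0.3, 0.1, 0.2, 0.4, 0.05, 0.7]  # model probabilities

toy_metrics = agreement_metrics(toy_reference, toy_predicted)
toy_auc = auc(toy_reference, toy_scores)
```

Kappa is reported alongside accuracy because it discounts the agreement two raters would reach by chance, which matters when one class (here, BMO absent) is much more common than the other.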
About the journal:
Rheumatology strives to support research and discovery by publishing the highest quality original scientific papers with a focus on basic, clinical and translational research. The journal’s subject areas cover a wide range of paediatric and adult rheumatological conditions from an international perspective. It is an official journal of the British Society for Rheumatology, published by Oxford University Press.
Rheumatology publishes original articles, reviews, editorials, guidelines, concise reports, meta-analyses, original case reports, clinical vignettes, letters and matters arising from published material. The journal takes pride in serving the global rheumatology community, with a focus on high societal impact in the form of podcasts, videos and extended social media presence, and utilizing metrics such as Altmetric. Keep up to date by following the journal on Twitter @RheumJnl.