AI-Based Measurement of Lumbar Spinal Stenosis on MRI: External Evaluation of a Fully Automated Model
Sanja Bogdanovic, Matthias Staib, Marco Schleiniger, Livio Steiner, Leonardo Schwarz, Christoph Germann, Reto Sutter, Benjamin Fritz
Investigative Radiology, pp. 656-666. Published 2024-09-01 (Epub 2024-03-01). DOI: 10.1097/RLI.0000000000001070 (https://doi.org/10.1097/RLI.0000000000001070)
Abstract
Objectives: The aim of this study was to clinically validate a fully automated AI model for magnetic resonance imaging (MRI)-based quantification of lumbar spinal canal stenosis.
Materials and methods: This retrospective study included lumbar spine MRI examinations of 100 consecutive clinical patients (56 ± 17 years; 43 females, 57 males) performed on clinical 1.5 T (51 examinations) and 3 T (49 examinations) MRI scanners with heterogeneous clinical imaging protocols. The AI model segmented the thecal sac on axial T2-weighted sequences. Based on these segmentations, the anteroposterior (AP) and mediolateral (ML) distances and the area of the thecal sac were measured in a fully automated manner. For comparison, 2 fellowship-trained musculoskeletal radiologists performed the same segmentations and measurements independently. Statistics included 1-sample t tests, the intraclass correlation coefficient (ICC), Bland-Altman plots, and Dice coefficients. A P value of <0.05 was considered statistically significant.
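As an illustration of the kind of measurement described above, the following minimal Python sketch derives the area and the AP and ML extents from a binary thecal sac mask and computes a Dice coefficient between two masks. It is not the study's implementation. The function names, the toy masks, the pixel spacing, the assumption that image rows run anteroposterior and columns mediolateral, and the use of the bounding-box extent as the "distance" are all hypothetical choices made for this example; the abstract does not state how the distances were derived from the segmentations.

```python
import numpy as np


def thecal_sac_measurements(mask, spacing_mm):
    """Return area (mm^2) and AP/ML extents (mm) of a binary 2D mask.

    Assumptions (not from the paper): rows run anteroposterior, columns run
    mediolateral, and each distance is the bounding-box extent of the mask.
    spacing_mm is (row_spacing, col_spacing) in millimetres.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return {"area_mm2": 0.0, "ap_mm": 0.0, "ml_mm": 0.0}
    row_mm, col_mm = spacing_mm
    rows = np.flatnonzero(mask.any(axis=1))  # rows containing foreground
    cols = np.flatnonzero(mask.any(axis=0))  # columns containing foreground
    return {
        "area_mm2": float(mask.sum() * row_mm * col_mm),
        "ap_mm": float((rows[-1] - rows[0] + 1) * row_mm),
        "ml_mm": float((cols[-1] - cols[0] + 1) * col_mm),
    }


def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A∩B| / (|A| + |B|); defined as 1.0 if both are empty."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)


# Toy masks standing in for an AI segmentation and a reader segmentation.
ai_mask = np.zeros((8, 8), dtype=bool)
ai_mask[2:6, 1:7] = True      # 4 rows x 6 columns = 24 pixels
reader_mask = np.zeros((8, 8), dtype=bool)
reader_mask[2:6, 2:7] = True  # 4 rows x 5 columns = 20 pixels

measurements = thecal_sac_measurements(ai_mask, spacing_mm=(0.5, 0.5))
dice = dice_coefficient(ai_mask, reader_mask)
print(measurements, round(dice, 3))
```

A bounding-box extent is the simplest option; a pipeline could just as well measure the distances along a fitted axis or at a fixed anatomical position, which would give different values for irregularly shaped masks.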
Results: The average measurements of the AI model, reader 1, and reader 2 were 194 ± 72 mm², 181 ± 71 mm², and 179 ± 70 mm² for thecal sac area; 13 ± 3.3 mm, 12.6 ± 3.3 mm, and 12.6 ± 3.2 mm for AP distance; and 19.5 ± 3.9 mm, 20 ± 4.3 mm, and 19.4 ± 4 mm for ML distance, respectively. Significant differences existed for all pairwise comparisons except reader 1 versus the AI model for the ML distance and reader 1 versus reader 2 for the AP distance (P = 0.1 and P = 0.21, respectively). The pairwise mean absolute errors among reader 1, reader 2, and the AI model ranged from 0.59 mm to 0.75 mm for the AP distance, from 1.16 mm to 1.37 mm for the ML distance, and from 7.9 mm² to 15.54 mm² for the thecal sac area. Pairwise ICCs among reader 1, reader 2, and the AI model ranged from 0.91 to 0.94 for the AP distance and from 0.86 to 0.9 for the ML distance, without significant differences. For the thecal sac area, the pairwise ICCs between each reader and the AI model (0.97 each) were slightly but significantly lower than the ICC between reader 1 and reader 2 (0.99). Similarly, the Dice coefficient and Hausdorff distance between both readers and the AI model were significantly lower than the values between reader 1 and reader 2, overall ranging from 0.93 to 0.95 for the Dice coefficients and 1.1 to 1.44 for the Hausdorff distances.
Conclusions: The investigated AI model is reliable for assessing the AP and ML thecal sac diameters with human-level accuracy. The small differences in measurement and segmentation of the thecal sac area between the AI model and the radiologists are likely within a clinically acceptable range.
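The agreement statistics reported above can be illustrated with a short Python sketch that computes a pairwise mean absolute error and the Bland-Altman bias with 95% limits of agreement. The input values are invented for the example and are not study data; the function names are hypothetical, and the limits use the conventional bias ± 1.96 × SD of the paired differences, which the abstract does not spell out.

```python
from statistics import mean, stdev


def mean_absolute_error(a, b):
    """Mean of |a_i - b_i| over paired measurements."""
    return mean([abs(x - y) for x, y in zip(a, b)])


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) of agreement.

    bias is the mean paired difference a - b; the limits are
    bias -/+ 1.96 * sample SD of the differences (conventional choice).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up AP distances in mm for four levels (illustration only).
ai_ap = [13.0, 12.0, 15.0, 10.0]
reader_ap = [12.0, 12.0, 14.0, 11.0]

mae = mean_absolute_error(ai_ap, reader_ap)
bias, lower, upper = bland_altman(ai_ap, reader_ap)
print(f"MAE = {mae:.2f} mm, bias = {bias:.2f} mm, "
      f"limits of agreement = [{lower:.2f}, {upper:.2f}] mm")
```

The two numbers answer different questions: the mean absolute error summarises the typical size of a disagreement, whereas the Bland-Altman bias shows whether one rater systematically measures larger than the other.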
About the journal:
Investigative Radiology publishes original, peer-reviewed reports on clinical and laboratory investigations in diagnostic imaging, the diagnostic use of radioactive isotopes, computed tomography, positron emission tomography, magnetic resonance imaging, ultrasound, digital subtraction angiography, and related modalities. Emphasis is on early and timely publication. Primarily research-oriented, the journal also includes a wide variety of features of interest to clinical radiologists.