Automated spinopelvic measurements on radiographs with artificial intelligence: a multi-reader study
Boj Friedrich Hoppe, Johannes Rueckel, Jan Rudolph, Nicola Fink, Simon Weidert, Wolf Hohlbein, Adrian Cavalcanti-Kußmaul, Lena Trappmann, Basel Munawwar, Jens Ricke, Bastian Oliver Sabel
Radiologia Medica, pp. 359-367. Published 2025-03-01 (Epub 2025-01-26). DOI: 10.1007/s11547-025-01957-5
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11903605/pdf/
Abstract
Purpose: To develop an artificial intelligence (AI) algorithm for automated measurements of spinopelvic parameters on lateral radiographs and compare its performance to multiple experienced radiologists and surgeons.
Methods: On lateral full-spine radiographs of 295 consecutive patients, a two-stage region-based convolutional neural network (R-CNN) was trained to detect anatomical landmarks and calculate thoracic kyphosis (TK), lumbar lordosis (LL), sacral slope (SS), and sagittal vertical axis (SVA). Performance was evaluated on 65 radiographs not used for training, which were measured independently by 6 readers (3 radiologists, 3 surgeons); the median per measurement was set as the reference standard. Intraclass correlation coefficient (ICC), mean absolute error (MAE), and standard deviation (SD) were used for statistical analysis, and ANOVA was used to test for significant differences between the AI and the human readers.
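As an illustration of how angles and offsets can be derived once landmarks are detected, the following is a minimal sketch based on the standard geometric definitions (Cobb-type angle between two endplate lines, sacral slope as the angle between the S1 endplate and the horizontal, SVA as the horizontal offset between the C7 plumb line and the posterosuperior corner of S1). It is not the authors' implementation: all function names, the landmark format (pixel coordinates as (x, y) tuples), and the `mm_per_pixel` calibration parameter are assumptions made for this example.

```python
import math

def line_angle_deg(p, q):
    """Direction of the line from p to q in degrees (image coordinates)."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))

def cobb_angle_deg(endplate_a, endplate_b):
    """Acute angle between two endplate lines, each given as a pair of (x, y) points.

    Assumption: the result is folded into [0, 90] degrees, so angles above
    90 degrees are not represented in this simplified sketch.
    """
    a = line_angle_deg(*endplate_a)
    b = line_angle_deg(*endplate_b)
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def sacral_slope_deg(s1_endplate):
    """Angle between the S1 superior endplate line and the horizontal."""
    horizontal = ((0.0, 0.0), (1.0, 0.0))
    return cobb_angle_deg(s1_endplate, horizontal)

def sva_mm(c7_centroid, s1_posterosuperior, mm_per_pixel):
    """Signed horizontal offset of the C7 plumb line from the S1 corner, in mm.

    The sign follows the image x-axis; which direction counts as positive
    depends on patient orientation and is not specified here.
    """
    return (c7_centroid[0] - s1_posterosuperior[0]) * mm_per_pixel

# Hypothetical landmark coordinates, for illustration only.
s1_endplate = ((100.0, 500.0), (200.0, 600.0))
print(sacral_slope_deg(s1_endplate))
print(sva_mm((120.0, 50.0), (100.0, 500.0), mm_per_pixel=0.5))
```

TK and LL would be obtained in the same way by passing the endplate lines of the relevant vertebrae to `cobb_angle_deg`; the abstract does not state which vertebral levels were used.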
Results: Automatic measurements (AI) showed excellent correlation with the reference standard, with all ICCs within the range of the readers (TK: 0.92 [AI] vs. 0.85-0.96 [readers]; LL: 0.95 vs. 0.87-0.98; SS: 0.93 vs. 0.89-0.98; SVA: 1.00 vs. 0.99-1.00; all p < 0.001). The MAE (± SD) of the AI was comparable to that of the six readers (TK: 3.71° (± 4.24) [AI] vs. 1.86-5.88° (± 3.48-6.17) [readers]; LL: 4.53° (± 4.68) vs. 2.21-5.34° (± 2.60-7.38); SS: 4.56° (± 6.10) vs. 2.20-4.76° (± 3.15-7.37); SVA: 2.44 mm (± 3.93) vs. 1.22-2.79 mm (± 2.42-7.11)), and ANOVA showed no significant difference between the errors of the AI and those of any human reader (all p > 0.05). Human reading time was on average 139 s per case (range: 86-231 s).
Conclusion: Our AI algorithm provides spinopelvic measurements whose accuracy lies within the variability of experienced readers, with the potential to save time and increase reproducibility.
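The evaluation scheme described in the Methods (per-case median of the readers as reference standard, then MAE ± SD against that reference) can be sketched as follows. This is a hedged illustration, not the study's analysis code: the function names and the toy numbers are invented, and it assumes that the reported SD is the sample standard deviation of the absolute errors, which the abstract does not state explicitly.

```python
import statistics

def reference_standard(reader_values):
    """Per-case median across readers.

    reader_values: one list per reader, each holding that reader's
    measurement for every case, in the same case order.
    """
    return [statistics.median(case) for case in zip(*reader_values)]

def mae_and_sd(measurements, reference):
    """Mean absolute error and sample SD of the absolute errors."""
    abs_errors = [abs(m - r) for m, r in zip(measurements, reference)]
    return statistics.mean(abs_errors), statistics.stdev(abs_errors)

# Toy data (not from the study): three readers, two cases, angles in degrees.
readers = [
    [30.0, 40.0],
    [32.0, 42.0],
    [34.0, 44.0],
]
ai = [33.0, 40.0]

reference = reference_standard(readers)
mae, sd = mae_and_sd(ai, reference)
print(reference, mae, sd)
```

The same `mae_and_sd` call applied to each reader's list would give per-reader errors for comparison with the AI.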
About the journal:
Felice Perussia founded La radiologia medica in 1914. It is a peer-reviewed journal and serves as the official journal of the Italian Society of Medical and Interventional Radiology (SIRM). The primary purpose of the journal is to disseminate information related to Radiology, especially advancements in diagnostic imaging and related disciplines. La radiologia medica welcomes original research on both fundamental and clinical aspects of modern radiology, with a particular focus on diagnostic and interventional imaging techniques. It also covers topics such as radiotherapy, nuclear medicine, radiobiology, health physics, and artificial intelligence in the context of clinical implications. The journal includes various types of contributions such as original articles, review articles, editorials, short reports, and letters to the editor. With an esteemed Editorial Board and a selection of insightful reports, the journal is an indispensable resource for radiologists and professionals in related fields. Ultimately, La radiologia medica aims to serve as a platform for international collaboration and knowledge sharing within the radiological community.