Andy Tsai, Sandip Samal, Paul Lamonica, Nicole Morris, John McNeil, Rudolph Pienaar
Journal: European Radiology (JCR Q1, Radiology, Nuclear Medicine & Medical Imaging; impact factor 4.7). Publication type: Journal Article. Published: 2025-09-24. DOI: 10.1007/s00330-025-12022-0 (https://doi.org/10.1007/s00330-025-12022-0)
Abstract
Clinical deployment and prospective validation of an AI model for limb-length discrepancy measurements using an open-source platform.
Objectives: To deploy an AI model to measure limb-length discrepancy (LLD) and prospectively validate its performance.
Materials and methods: We encoded the inference of an LLD AI model into a Docker container, incorporated it into a computational platform for clinical deployment, and conducted two prospective validation studies: a shadow trial (07/2024-09/2024) and a clinical trial (11/2024-01/2025). During each trial period, we queried for LLD EOS scanograms to serve as inputs to our model. For the shadow trial, we hid the AI-annotated outputs from the radiologists, and for the clinical trial, we displayed the AI-annotated outputs to the radiologists at the time of study interpretation. Afterward, we collected the bilateral femoral and tibial lengths from the radiology reports and compared them against those generated by the AI model. We used median absolute difference (MAD) and interquartile range (IQR) as summary statistics to assess the performance of our model.
Results: Our shadow trial consisted of 84 EOS scanograms from 84 children, with 168 femoral and tibial lengths. The MAD (IQR) of the femoral and tibial lengths were 0.2 cm (0.3 cm) and 0.2 cm (0.3 cm), respectively. Our clinical trial consisted of 114 EOS scanograms from 114 children, with 228 femoral and tibial lengths. The MAD (IQR) of the femoral and tibial lengths were 0.3 cm (0.4 cm) and 0.2 cm (0.3 cm), respectively.
Conclusion: We successfully employed a computational platform for seamless integration and deployment of an LLD AI model into our clinical workflow, and prospectively validated its performance.
Key points: Question: No AI models have been clinically deployed for limb-length discrepancy (LLD) assessment in children, and the prospective validation of these models is unknown. Findings: We deployed an LLD AI model using a homegrown platform, with prospective trials showing a median absolute difference of 0.2-0.3 cm in estimating bone lengths. Clinical relevance: An LLD AI model with performance comparable to that of radiologists can serve as a secondary reader in increasing the confidence and accuracy of LLD measurements.
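The summary statistics named in the Materials and methods can be illustrated with a minimal sketch. It assumes that the MAD is the median of the absolute differences between report-derived and model-derived lengths, and that the IQR is taken over those same absolute differences; the abstract does not spell out either detail, nor the quartile convention. The function name `mad_iqr` and the sample values are hypothetical and are not data from the study.

```python
from statistics import median, quantiles


def mad_iqr(report_cm, model_cm):
    """Return (median absolute difference, IQR of the absolute differences).

    Assumption: both statistics are computed on |report - model| per bone;
    quartiles use the 'inclusive' convention, which the source does not specify.
    """
    if len(report_cm) != len(model_cm):
        raise ValueError("paired lists must have equal length")
    abs_diff = [abs(r - m) for r, m in zip(report_cm, model_cm)]
    q1, _, q3 = quantiles(abs_diff, n=4, method="inclusive")
    return median(abs_diff), q3 - q1


# Hypothetical paired femoral lengths in cm (illustrative only).
report = [40.1, 38.5, 42.0, 36.7, 41.3]
model = [40.3, 38.4, 42.5, 36.7, 41.0]

mad, iqr = mad_iqr(report, model)
print(f"MAD = {mad:.1f} cm, IQR = {iqr:.1f} cm")
```

A different quartile convention (for example, the exclusive method) can shift the IQR slightly on small samples, so the convention is worth stating alongside any reported value.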
About the journal:
European Radiology (ER) continuously updates scientific knowledge in radiology by publishing strong original articles and state-of-the-art reviews written by leading radiologists. A well-balanced combination of review articles, original papers, short communications from European radiological congresses, and information on society matters makes ER an indispensable source of current information in this field.
This is the Journal of the European Society of Radiology, and the official journal of a number of societies.
From 2004 to 2008, supplements to European Radiology were published under its companion, European Radiology Supplements, ISSN 1613-3749.