A critical comparative study of the performance of three AI-assisted programs for bone age determination.

Impact factor 4.7 · Zone 2 (Medicine) · JCR Q1: Radiology, Nuclear Medicine & Medical Imaging
Johanna Pape, Maciej Rosolowski, Roland Pfäffle, Anne B Beeskow, Daniel Gräfe
European Radiology. Journal Article. Published 2024-11-05. DOI: 10.1007/s00330-024-11169-6 (https://doi.org/10.1007/s00330-024-11169-6)
Citations: 0

Abstract

Objectives: To date, AI-supported programs for bone age (BA) determination according to Greulich and Pyle (G&P) for medical use in Europe have almost exclusively been validated separately. Therefore, the current study aimed to compare the performance of three programs, namely BoneXpert, PANDA, and BoneView, on a single Central European population.

Materials and methods: For this retrospective study, hand radiographs of 306 children aged 1-18 years, stratified by gender and age, were included. A subgroup was formed from the age group that accounts for 90% of examinations in clinical practice. The G&P BA was estimated by three human experts (as ground truth) and by three AI-supported programs. The mean absolute deviation, the root mean squared error (RMSE), and the number of dropouts by the AI were calculated.
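The three reported measures can be illustrated with a minimal sketch. It assumes that "mean absolute deviation" means the mean absolute difference between a program's estimate and the ground truth, and that a dropout is a radiograph for which the program returns no estimate. The function name `agreement_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def agreement_metrics(ground_truth, predictions):
    """Return (MAD, RMSE, dropout rate) for one program.

    ground_truth: bone ages in years (assumed to be given, e.g. the expert consensus).
    predictions:  program estimates in years; None marks a dropout (assumption).
    """
    if len(ground_truth) != len(predictions):
        raise ValueError("inputs must have the same length")

    # Keep only the radiographs the program was able to analyse.
    pairs = [(g, p) for g, p in zip(ground_truth, predictions) if p is not None]
    if not pairs:
        raise ValueError("no analysable radiographs")

    errors = [p - g for g, p in pairs]
    mad = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    dropout_rate = 1 - len(pairs) / len(predictions)
    return mad, rmse, dropout_rate


# Made-up bone ages in years, for illustration only.
truth = [2.0, 5.0, 10.0, 14.0]
preds = [2.5, 4.5, None, 15.0]
mad, rmse, dropout = agreement_metrics(truth, preds)
print(f"MAD={mad:.2f} y, RMSE={rmse:.2f} y, dropouts={dropout:.0%}")
```

Because RMSE squares each error before averaging, it penalises occasional large misses more heavily than the mean absolute deviation does, which is one reason to report both.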

Results: The correlation between each program and the ground truth was strong (R² ≥ 0.98). In the total group, BoneXpert had a lower RMSE than BoneView and PANDA (0.62 vs. 0.65 and 0.75 years), with dropout rates of 2.3%, 20.3%, and 0%, respectively. In the subgroup, the RMSE differed less (0.66 vs. 0.68 and 0.65 years; max. 4% dropouts). The standard deviation between the AI readers was lower than that between the human readers (0.54 vs. 0.62 years, p < 0.01).

Conclusion: All three AI programs predict BA according to G&P in the main age range with similarly high reliability. Differences arise at the boundaries of childhood.

Key points:
Question: There is a lack of comparative, independent validation for artificial intelligence-based bone age estimation in children.
Findings: Three commercially available programs estimate bone age according to Greulich and Pyle with similarly high reliability in a Central European cohort.
Clinical relevance: The comparative study will help the reader choose software for bone age estimation approved for the European market, depending on the targeted age group and economic considerations.

Source journal
European Radiology (Medicine: Nuclear Medicine)
CiteScore: 11.60
Self-citation rate: 8.50%
Articles published: 874
Review time: 2-4 weeks
About the journal: European Radiology (ER) continuously updates scientific knowledge in radiology by publishing strong original articles and state-of-the-art reviews written by leading radiologists. A well-balanced combination of review articles, original papers, short communications from European radiological congresses, and information on society matters makes ER an indispensable source of current information in this field. It is the journal of the European Society of Radiology and the official journal of a number of societies. From 2004 to 2008, supplements to European Radiology were published under its companion title, European Radiology Supplements (ISSN 1613-3749).