Potential of Artificial Intelligence for Bone Age Assessment in Iranian Children and Adolescents: An Exploratory Study.

IF 1.0 | Zone 4, Medicine | JCR Q3, MEDICINE, GENERAL & INTERNAL
Mehrzad Lotfi, Nahid Abolpour, Mohammadreza Ghasemi, Hajar Heydari, Reza Pourghayumi
Archives of Iranian Medicine, volume 28, issue 4, pages 198-206. Published 2025-04-01. Publication type: Journal Article.
DOI: https://doi.org/10.34172/aim.32070
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12085795/pdf/
Citations: 0

Abstract

Potential of Artificial Intelligence for Bone Age Assessment in Iranian Children and Adolescents: An Exploratory Study.

Background: To investigate whether the bone age (BA) of Iranian children could be accurately assessed via an artificial intelligence (AI) system. Accurate assessment of skeletal maturity is crucial for diagnosing and treating various musculoskeletal disorders, and is traditionally achieved through manual comparison with the Greulich-Pyle atlas. This process, however, is subjective and time-consuming. Recent advances in deep learning offer more efficient and consistent BA evaluations.

Methods: A total of 555 left-hand radiographs (220 boys and 335 girls) were collected from children aged 1-18 years who presented to a tertiary research hospital. The reference BA was determined via the Greulich and Pyle (GP) method by two radiologists in consensus. The BA was then estimated using a deep learning model specifically developed for this population. Model performance was evaluated using multiple metrics: mean squared error (MSE), mean absolute error (MAE), intra-class correlation coefficient (ICC), and 95% limits of agreement (LoA). Gender-specific results were analyzed separately.
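The error and agreement metrics named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `agreement_metrics` and the toy data are hypothetical, and the limits of agreement are assumed to follow the usual Bland-Altman form (mean difference ± 1.96 × SD of the differences), which the abstract does not spell out. ICC is left out because the abstract does not state which ICC form was used.

```python
from statistics import mean, stdev

def agreement_metrics(reference, predicted):
    """Compare predicted bone ages with reference bone ages (both in years)."""
    # Signed error per radiograph: positive means the model overestimates.
    diffs = [p - r for p, r in zip(predicted, reference)]
    mse = mean(d ** 2 for d in diffs)    # mean squared error, in years squared
    mae = mean(abs(d) for d in diffs)    # mean absolute error, in years
    bias = mean(diffs)                   # mean signed difference, in years
    sd = stdev(diffs)                    # sample SD of the differences
    # Assumed Bland-Altman 95% limits of agreement.
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)
    return {"mse": mse, "mae": mae, "bias": bias, "loa": loa}

# Hypothetical toy values, not data from the study.
reference = [5.0, 8.5, 12.0, 15.0]
predicted = [5.5, 8.0, 12.5, 16.0]
metrics = agreement_metrics(reference, predicted)
```

In a sex-specific analysis like the one described, the same function would simply be called once per subgroup.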

Results: The model demonstrated acceptable accuracy. For boys, MSE was 0.55 years², MAE was 0.59 years, ICC was 0.74, and the 95% LoA ranged from -0.8 to 1.2 years. For girls, MSE was 0.59 years², MAE was 0.61 years, ICC was 0.82, and the 95% LoA ranged from -0.6 to 1.0 years. These results indicate stronger predictive accuracy for girls compared to boys.
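To make the reported limits of agreement easier to compare, the sketch below backs out the mean difference and the spread they imply. It assumes the LoA were computed in the Bland-Altman form (mean difference ± 1.96 × SD), which the abstract does not confirm; the helper name `bias_and_sd_from_loa` is hypothetical.

```python
def bias_and_sd_from_loa(lower, upper, z=1.96):
    """Recover the mean difference and SD implied by symmetric limits of agreement."""
    bias = (lower + upper) / 2       # midpoint of the interval
    sd = (upper - lower) / (2 * z)   # half-width divided by the z multiplier
    return bias, sd

# LoA bounds as reported in the Results (years).
boys_bias, boys_sd = bias_and_sd_from_loa(-0.8, 1.2)
girls_bias, girls_sd = bias_and_sd_from_loa(-0.6, 1.0)
```

Under that assumption both groups share an implied mean difference of about 0.2 years, while the implied SD is roughly 0.51 years for boys and 0.41 years for girls, which is one way to read the wider interval for boys.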

Conclusion: Our findings demonstrate that the proposed deep learning model achieves reasonable accuracy in BA assessment, with stronger performance in girls compared to boys. However, the relatively wide 95% LoA, particularly for boys, and prediction errors at the extremes of the age range highlight the need for further refinement and validation. While the model shows potential as a supplementary tool for clinicians, future studies should focus on improving prediction accuracy, reducing variability, and validating the model on larger, more diverse datasets before considering widespread clinical implementation. Additionally, addressing edge cases and specific conditions that a human reviewer may detect but the model might overlook will be essential for enhancing its clinical reliability.

Source journal
Archives of Iranian Medicine (Medicine: General & Internal)
CiteScore: 4.20
Self-citation rate: 0.00%
Articles published: 67
Review time: 3-8 weeks
Journal description: Aim and Scope: The Archives of Iranian Medicine (AIM) is a monthly peer-reviewed multidisciplinary medical publication. The journal welcomes contributions particularly relevant to the Middle-East region and publishes biomedical experiences and clinical investigations on prevalent diseases in the region as well as analyses of factors that may modulate the incidence, course, and management of diseases and pertinent medical problems. Manuscripts with didactic orientation and subjects exclusively of local interest will not be considered for publication. The 2016 Impact Factor of "Archives of Iranian Medicine" is 1.20.