Dental Age Estimation from Panoramic Radiographs: A Comparison of Orthodontist and ChatGPT-4 Evaluations Using the London Atlas, Nolla, and Haavikko Methods

IF 3.3 | Zone 3 (Medicine) | Q1 MEDICINE, GENERAL & INTERNAL
Derya Dursun, Rumeysa Bilici Geçer
Journal: Diagnostics, 15(18)
Publication date: 2025-09-19
Publication type: Journal Article
DOI: 10.3390/diagnostics15182389 (https://doi.org/10.3390/diagnostics15182389)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12468368/pdf/
Citations: 0

Abstract

Dental Age Estimation from Panoramic Radiographs: A Comparison of Orthodontist and ChatGPT-4 Evaluations Using the London Atlas, Nolla, and Haavikko Methods.

Background: Dental age (DA) estimation, which is widely used in orthodontics, pediatric dentistry, and forensic dentistry, predicts chronological age (CA) by assessing tooth development and maturation. Most methods rely on radiographic evaluation of tooth mineralization and eruption stages to assess DA. With the increasing adoption of large language models (LLMs) in medical sciences, the use of ChatGPT has extended to processing visual data. The aim of this study, therefore, was to evaluate the performance of ChatGPT-4 in estimating DA from panoramic radiographs using three conventional methods (Nolla, Haavikko, and London Atlas) and to compare its accuracy against both orthodontist assessments and CA.

Methods: In this retrospective study, panoramic radiographs of 511 Turkish children aged 6-17 years were assessed. DA was estimated using the Nolla, Haavikko, and London Atlas methods by both orthodontists and ChatGPT-4. The DA-CA difference and mean absolute error (MAE) were calculated, and statistical comparisons were performed to assess accuracy, sex differences, and agreement between the evaluators, with significance set at p < 0.05.

Results: The mean CA of the study population was 12.37 ± 2.95 years (boys: 12.39 ± 2.94; girls: 12.35 ± 2.96). Using the London Atlas method, the orthodontists overestimated CA with a DA-CA difference of 0.78 ± 1.26 years (p < 0.001), whereas ChatGPT-4 showed no significant DA-CA difference (0.03 ± 0.93; p = 0.399). Using the Nolla method, the orthodontist showed no significant DA-CA difference (0.03 ± 1.14; p = 0.606), but ChatGPT-4 underestimated CA with a DA-CA difference of -0.40 ± 1.96 years (p < 0.001). Using the Haavikko method, both evaluators underestimated CA (orthodontist: -0.88; ChatGPT-4: -1.18; p < 0.001). The lowest MAE for ChatGPT-4 was obtained with the London Atlas method (0.59 ± 0.72), followed by Nolla (1.33 ± 1.28) and Haavikko (1.51 ± 1.41). For the orthodontists, the lowest MAE was achieved with the Nolla method (0.86 ± 0.75). Agreement between the orthodontists and ChatGPT-4 was highest with the London Atlas method (ICC = 0.944, r = 0.905).

Conclusions: ChatGPT-4 showed the highest accuracy with the London Atlas method, with no significant difference from CA for either sex and the lowest prediction error. When using the Nolla and Haavikko methods, both ChatGPT-4 and the orthodontist tended to underestimate age, with higher errors. Overall, ChatGPT-4 performed best when using visually guided methods and was less accurate when using multi-stage scoring methods.
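The accuracy measures named in the abstract (signed DA-CA difference, MAE, and the correlation between two evaluators) can be illustrated with a minimal sketch. It uses only the Python standard library. The function names and the sample values are hypothetical and are not taken from the study. The ICC is left out because the abstract does not state which ICC form was used.

```python
from math import sqrt
from statistics import mean, stdev


def signed_differences(dental_ages, chronological_ages):
    """Per-child DA-CA difference; a positive value means age was overestimated."""
    return [da - ca for da, ca in zip(dental_ages, chronological_ages)]


def mean_absolute_error(dental_ages, chronological_ages):
    """Average size of the DA-CA error, ignoring its direction."""
    return mean(abs(d) for d in signed_differences(dental_ages, chronological_ages))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up illustrative ages in years, NOT data from the study.
ca = [8.0, 10.5, 12.0, 14.5, 16.0]
da_rater_a = [8.5, 11.0, 13.0, 15.0, 17.0]   # hypothetical evaluator A
da_rater_b = [8.0, 10.0, 12.5, 14.0, 16.5]   # hypothetical evaluator B

diffs_a = signed_differences(da_rater_a, ca)
print(f"Evaluator A, DA-CA: {mean(diffs_a):.2f} ± {stdev(diffs_a):.2f} years")
print(f"Evaluator A, MAE: {mean_absolute_error(da_rater_a, ca):.2f} years")
print(f"Evaluator B, MAE: {mean_absolute_error(da_rater_b, ca):.2f} years")
print(f"Pearson r between evaluators: {pearson_r(da_rater_a, da_rater_b):.3f}")
```

The signed difference and the MAE answer different questions. In the sample data, evaluator B's over- and underestimates cancel, giving a mean DA-CA difference of zero while the MAE stays above zero. The abstract reports the same pattern for ChatGPT-4 with the London Atlas method, where the mean difference is 0.03 and the MAE is 0.59.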

Source journal
Diagnostics
Subject area: Biochemistry, Genetics and Molecular Biology (Clinical Biochemistry)
CiteScore: 4.70
Self-citation rate: 8.30%
Articles published: 2699
Review time: 19.64 days
About the journal: Diagnostics (ISSN 2075-4418) is an international scholarly open access journal on medical diagnostics. It publishes original research articles, reviews, communications and short notes on the research and development of medical diagnostics. There is no restriction on the length of the papers. Our aim is to encourage scientists to publish their experimental and theoretical research in as much detail as possible. Full experimental and/or methodological details must be provided for research articles.