Artificial intelligence in sleep medicine: assessing the diagnostic precision of ChatGPT-4.

Impact factor 2.9 · Zone 3 (Medicine) · JCR Q1 (Clinical Neurology)
Anshum Patel, Joseph Cheung
Journal of Clinical Sleep Medicine. Published 2025-09-01. DOI: 10.5664/jcsm.11732 (https://doi.org/10.5664/jcsm.11732). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12406831/pdf/

Abstract

Study objectives: Large language models such as ChatGPT-4 are emerging in medicine, including sleep medicine, where artificial intelligence is used to analyze sleep data and predict treatment outcomes. The effectiveness of large language models in accurately diagnosing sleep disorders based on clinical history has not yet been studied. This study evaluates ChatGPT-4's diagnostic performance using clinical vignettes.

Methods: Nineteen clinical vignettes containing patient history, physical examination findings, and diagnostic tests from the Case Book of Sleep Medicine (third edition, 2019, American Academy of Sleep Medicine) were presented to ChatGPT-4. Its differential and final diagnoses were compared to reference diagnoses, with accuracy assessed by (1) the percentage of correct differentials and (2) a 3-tier scoring system (no match, partial match, full match) for final diagnoses.
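
The two accuracy measures described in the Methods can be sketched roughly as follows. This is a minimal illustration rather than the authors' procedure: the point values (0, 1, 2) are assumed from the "out of 2" scoring in the Results, the denominator for differential accuracy is assumed to be the number of reference differentials, and exact string matching stands in for whatever clinical judgment the authors used. The names `Match`, `differential_accuracy`, and `final_accuracy` are hypothetical.

```python
from enum import IntEnum


class Match(IntEnum):
    """Three-tier final-diagnosis score; the point values are an assumption."""
    NONE = 0
    PARTIAL = 1
    FULL = 2


def differential_accuracy(model_dx, reference_dx):
    """Percent of reference differentials that the model also listed.

    Assumes the denominator is the reference list and that diagnoses
    match on normalized exact strings (a simplification).
    """
    reference = {d.strip().lower() for d in reference_dx}
    model = {d.strip().lower() for d in model_dx}
    if not reference:
        raise ValueError("reference differential list is empty")
    return 100.0 * len(reference & model) / len(reference)


def final_accuracy(scores):
    """Total points earned as a percent of the maximum (2 per case)."""
    scores = list(scores)
    if not scores:
        raise ValueError("no cases scored")
    return 100.0 * sum(scores) / (int(Match.FULL) * len(scores))


# Hypothetical case, not taken from the study data.
reference = ["Obstructive sleep apnea", "Narcolepsy", "Insufficient sleep syndrome"]
model = ["obstructive sleep apnea", "narcolepsy", "restless legs syndrome"]
case_accuracy = differential_accuracy(model, reference)
```

In this made-up case the model recovers two of three reference differentials, so `case_accuracy` is about 66.67.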

Results: The mean accuracy for differential diagnoses was 63.27% ± 15.61% (standard deviation), ranging from 33.33% to 100%. The mean number of artificial intelligence-generated differential diagnoses matching the American Academy of Sleep Medicine case differential diagnoses was 2.79 ± 0.71 (standard deviation). For final diagnoses, ChatGPT-4 scored a total of 30 out of a possible 38, resulting in an overall accuracy of 78.95%. The model achieved a mean score of 1.58 ± 0.61 (standard deviation) out of 2, with 68.42% of cases achieving a full match. Performance was higher in cases with fewer differential diagnoses, whereas accuracy decreased in complex cases.
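
The final-diagnosis figures above follow from simple arithmetic on the reported totals. The short check below uses only numbers stated in the abstract; the variable names are illustrative, and the full-match count is back-calculated from the reported percentage rather than given in the source.

```python
n_cases = 19        # vignettes, from the Methods
max_per_case = 2    # a full match scores 2 ("out of 2" in the Results)
total_points = 30   # reported total score

# 30 of a possible 19 * 2 = 38 points
overall_accuracy = 100 * total_points / (n_cases * max_per_case)
mean_score = total_points / n_cases

# Back-calculated from the reported 68.42% full-match rate (not stated directly)
full_matches = round(0.6842 * n_cases)

print(f"{overall_accuracy:.2f}% overall, mean {mean_score:.2f}, {full_matches} full matches")
# prints "78.95% overall, mean 1.58, 13 full matches"
```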

Conclusions: ChatGPT-4 demonstrates promising diagnostic potential in sleep medicine, with moderate to high accuracy in identifying differential and final diagnoses, although its variability in more complex cases calls for refinement and clinical validation.

Citation: Patel A, Cheung J. Artificial intelligence in sleep medicine: assessing the diagnostic precision of ChatGPT-4. J Clin Sleep Med. 2025;21(9):1511-1517.

Source journal
CiteScore: 6.20
Self-citation rate: 7.00%
Articles published: 321
Review time: 1 month
About the journal: Journal of Clinical Sleep Medicine focuses on clinical sleep medicine. It emphasizes papers with direct applicability or relevance to the clinical practice of sleep medicine, including clinical trials, clinical reviews, clinical commentary and debate, medical economic and practice perspectives, case series, and novel or interesting case reports. The journal also publishes proceedings from conferences, workshops, and symposia sponsored by the American Academy of Sleep Medicine or other organizations related to improving the practice of sleep medicine.