Evaluating ChatGPT for neurocognitive disorder diagnosis: a multicenter study.

Impact factor 2.7 | Zone 3, Psychology | JCR Q2, Clinical Neurology
A Andrew Dimmick, Charlie C Su, Hanan S Rafiuddin, David C Cicero
DOI: 10.1080/13854046.2025.2475567 (https://doi.org/10.1080/13854046.2025.2475567)
Journal: Clinical Neuropsychologist, pages 1-16. Published: 2025-03-16. Publication type: Journal Article. Open access: no. Indexed via: Semantic Scholar.
Citations: 0

Abstract

Evaluating ChatGPT for neurocognitive disorder diagnosis: a multicenter study.

Objective: To evaluate the accuracy and reliability of ChatGPT 4 Omni in diagnosing neurocognitive disorders from comprehensive clinical data, and to compare its performance with that of previous versions of ChatGPT. Method: This project used a two-part design. Study 1 examined diagnostic agreement between ChatGPT 4 Omni and clinicians using a few-shot prompt approach. Study 2 compared the diagnostic performance of ChatGPT models using a zero-shot prompt approach, with data drawn from the National Alzheimer's Coordinating Center (NACC) Uniform Data Set 3. Study 1 included 12,922 older adults (mean age = 69.13, SD = 9.87), predominantly female (57%) and White (80%). Study 2 involved 537 older adults (mean age = 67.88, SD = 9.52), majority female (57%) and White (81%). Diagnoses included no cognitive impairment, amnestic mild cognitive impairment (MCI), nonamnestic MCI, and dementia. Results: In Study 1, ChatGPT 4 Omni showed fair association with clinician diagnoses (χ²(9) = 6021.96, p < .001; κ = .33). Notable predictive measures of agreement included the MoCA and memory recall tests. ChatGPT 4 Omni demonstrated high internal reliability (α = .96). In Study 2, no significant diagnostic agreement was found between ChatGPT versions and clinicians. Conclusions: Although ChatGPT 4 Omni shows potential in aligning with clinician diagnoses, its diagnostic accuracy is insufficient for clinical application without human oversight. Continued refinement and comprehensive training of AI models are essential to enhance their utility in neuropsychological assessment. With rapidly developing technological innovations, integrating AI tools into clinical practice could soon improve diagnostic efficiency and access to neuropsychological services.
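The agreement statistics reported above can be illustrated with a small sketch. It assumes a square cross-tabulation of clinician diagnoses (rows) against model diagnoses (columns) over the four diagnostic categories. The counts in `example_table` are invented purely for illustration and are not data from the study, and the function names are hypothetical. With four categories the chi-square test has (4 - 1) × (4 - 1) = 9 degrees of freedom, which is consistent with the reported χ²(9). Under the commonly used Landis and Koch convention, κ between .21 and .40 is labelled "fair", which matches the wording of the abstract for κ = .33.

```python
from typing import List, Tuple

Table = List[List[int]]


def cohen_kappa(table: Table) -> float:
    """Cohen's kappa for a square rater-by-rater contingency table."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: share of cases on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance from the marginal totals.
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def chi_square(table: Table) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for a table."""
    rows, cols = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(rows)) for j in range(cols)]
    stat = 0.0
    for i in range(rows):
        for j in range(cols):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat, (rows - 1) * (cols - 1)


# Hypothetical counts (NOT study data). Category order for rows and columns:
# no impairment, amnestic MCI, nonamnestic MCI, dementia.
example_table: Table = [
    [40, 10, 5, 5],
    [10, 20, 5, 5],
    [5, 5, 10, 10],
    [5, 5, 5, 30],
]

if __name__ == "__main__":
    kappa = cohen_kappa(example_table)
    stat, df = chi_square(example_table)
    print(f"kappa = {kappa:.2f}, chi-square({df}) = {stat:.2f}")
```

A large chi-square with a modest kappa, as in Study 1, is not contradictory: with a sample of 12,922 even a weak association is highly significant, while kappa measures how far exact agreement exceeds chance.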

Source journal: Clinical Neuropsychologist (Medicine: Clinical Neurology)
CiteScore: 8.40
Self-citation rate: 12.80%
Articles published: 61
Review time: 6-12 weeks
Journal description: The Clinical Neuropsychologist (TCN) serves as the premier forum for (1) state-of-the-art clinically-relevant scientific research, (2) in-depth professional discussions of matters germane to evidence-based practice, and (3) clinical case studies in neuropsychology. Of particular interest are papers that can make definitive statements about a given topic (thereby having implications for the standards of clinical practice) and those with the potential to expand today’s clinical frontiers. Research on all age groups, and on both clinical and normal populations, is considered.