Artificial intelligence performance in clinical neurology queries: the ChatGPT model.

IF 1.7 · CAS Region 4 (Medicine) · JCR Q3 Clinical Neurology
Neurological Research · Published: 2024-05-01 · Epub: 2024-03-24 · DOI: 10.1080/01616412.2024.2334118
Erman Altunisik, Yasemin Ekmekyapar Firat, Emine Kilicparlar Cengiz, Gulsum Bayana Comruk
Pages: 437-443
Citations: 0

Abstract

Introduction: The use of artificial intelligence technology is progressively expanding and advancing in the health and biomedical literature. Since its launch, ChatGPT has rapidly gained popularity and become one of the fastest-growing artificial intelligence applications in history. This study evaluated the accuracy and comprehensiveness of ChatGPT-generated responses to medical queries in clinical neurology.

Methods: We directed 216 questions from different subspecialties to ChatGPT. The questions were classified into three categories: multiple-choice, descriptive, and binary (yes/no answers). Each question in all categories was subjectively rated as easy, medium, or hard according to its difficulty level. Questions that also tested for intuitive clinical thinking and reasoning ability were evaluated in a separate category.

Results: ChatGPT correctly answered 141 questions (65.3%). No significant difference was detected in the accuracy and comprehensiveness scale scores or correct answer rates in comparisons made according to the question style or difficulty level. However, a comparative analysis assessing question characteristics revealed significantly lower accuracy and comprehensiveness scale scores and correct answer rates for questions based on interpretations that required critical thinking (p = 0.007, 0.007, and 0.001, respectively).
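The overall accuracy reported above can be reproduced directly from the counts stated in the abstract (141 correct answers out of 216 questions); a trivial check, using only numbers given in the text:

```python
# Overall accuracy from the raw counts reported in the Results section.
correct_answers = 141
total_questions = 216

accuracy_pct = round(correct_answers / total_questions * 100, 1)
print(accuracy_pct)  # 65.3, matching the reported 65.3%
```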

Conclusion: ChatGPT had a moderate overall performance in clinical neurology and demonstrated inadequate performance in answering questions that required interpretation and critical thinking. It also displayed limited performance in specific subspecialties. It is essential to acknowledge the limitations of artificial intelligence and diligently verify medical information produced by such models using reliable sources.

Journal
Neurological Research (Medicine, Clinical Neurology)
CiteScore: 3.60
Self-citation rate: 0.00%
Annual articles: 116
Review time: 5.3 months
Journal description: Neurological Research is an international, peer-reviewed journal reporting both basic and clinical research in the fields of neurosurgery, neurology, neuroengineering, and the neurosciences. It provides a medium for those who recognize the wider implications of their work and who wish to be informed of the relevant experience of others in related and more distant fields. The scope of the journal includes:
•Stem cell applications
•Molecular neuroscience
•Neuropharmacology
•Neuroradiology
•Neurochemistry
•Biomathematical models
•Endovascular neurosurgery
•Innovation in neurosurgery