Assessing ChatGPT Ability to Answer Frequently Asked Questions About Essential Tremor.

Tremor and Other Hyperkinetic Movements (IF 2.5, Q2 Clinical Neurology)
Pub Date: 2024-07-03 | eCollection Date: 2024-01-01 | DOI: 10.5334/tohm.917
Cristiano Sorrentino, Vincenzo Canoro, Maria Russo, Caterina Giordano, Paolo Barone, Roberto Erro
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11225576/pdf/

Abstract

Background: Large language models (LLMs) driven by artificial intelligence allow people to engage in direct conversations about their health. The accuracy and readability of the answers provided by ChatGPT, the best-known LLM, about Essential Tremor (ET), one of the commonest movement disorders, have not yet been evaluated.

Methods: Answers given by ChatGPT to 10 questions about ET were evaluated by 5 professionals and 15 laypeople on a scale from 1 (poor) to 5 (excellent) in terms of clarity, relevance, accuracy (professionals only), comprehensiveness, and overall value of the response. We further calculated the readability of the answers.
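The abstract does not specify which readability index was used; the Flesch Reading Ease formula is a common choice for evaluating patient-facing health text. A minimal sketch of that calculation, assuming a crude vowel-group syllable heuristic rather than a pronunciation dictionary, might look like:

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count contiguous vowel groups, drop a trailing
    # silent 'e'. Real readability tools use pronunciation dictionaries.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def flesch_reading_ease(text: str) -> float:
    # Flesch Reading Ease: higher scores mean easier text.
    # 60-70 is roughly "plain English"; below 30 is very difficult.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```

Scoring a simple sentence against dense clinical prose illustrates why chatbot answers written in medical register tend to fall below recommended readability levels for patient materials.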

Results: ChatGPT answers received relatively positive evaluations from both groups, with median scores between 4 and 5, independently of the type of question. However, there was only moderate agreement between raters, especially in the group of professionals. Moreover, readability levels were poor for all examined answers.

Discussion: ChatGPT provided relatively accurate and relevant answers, with some variability in the professionals' judgments. This suggests that the raters' degree of literacy about ET influenced the ratings and, indirectly, that the quality of information provided in clinical practice is also variable. Moreover, the readability of the answers provided by ChatGPT was found to be poor. LLMs will likely play a significant role in the future; therefore, health-related content generated by these tools should be monitored.
