ChatGPT for tinnitus information and support: response accuracy and retest after three months

W. Wiktor Jedrzejczak, Piotr H. Skarzynski, Danuta Raj-Koziak, Milaine Dominici Sanfins, Stavros Hatzopoulos, Krzysztof Kochanek
DOI: 10.1101/2023.12.19.23300189
Journal: medRxiv - Otolaryngology
Publication date: 2023-12-19
Publication type: Journal Article (preprint)
Citations: 0

Abstract

Background: ChatGPT, a conversational tool based on artificial intelligence, has recently been tested on a range of topics. However, most of the testing has involved broad domains of knowledge. Here we test ChatGPT's knowledge of tinnitus, an important but specialized aspect of audiology and otolaryngology. Testing involved evaluating ChatGPT's answers to a defined set of 10 questions on tinnitus. Furthermore, given that the technology is advancing quickly, we re-evaluated the responses to the same 10 questions 3 months later.

Material and method: ChatGPT (free version 3.5) was asked 10 questions on tinnitus at two points in time: August 2023 and November 2023. The accuracy of the responses was rated by 6 experts using a Likert scale ranging from 1 to 5. The number of words in each response was also counted, and responses were specifically examined for whether references were provided and whether consultation with a specialist was suggested.

Results: Most of ChatGPT's responses were rated as satisfactory or better. However, we did detect a few instances where the responses were not accurate and might be considered somewhat misleading. The responses from ChatGPT were quite long (averaging over 400 words) and occasionally tended to stray off-topic. No solid references to sources of information were ever supplied, and when references were specifically asked for, the sources given were fabricated. For most responses, consultation with a specialist was suggested. It is worth noting that after 3 months the responses had generally improved.

Conclusions: ChatGPT provided surprisingly good responses, given that the questions were quite specific. Although no potentially harmful errors were identified, some mistakes could be seen as somewhat misleading. No solid references were ever supplied. ChatGPT shows great potential if further developed by experts in specific areas, but for now it is not yet ready for serious application.
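The evaluation procedure described in the Methods (six experts rating each response on a 1 to 5 Likert scale at two test points, plus a word count per response) can be sketched as follows. This is a minimal illustration only: the questions, scores, and function names here are hypothetical and are not data from the study.

```python
from statistics import mean

# Hypothetical data: for each question, six expert Likert scores (1-5)
# recorded at the two test points (August 2023 and November 2023).
ratings = {
    "What causes tinnitus?": {
        "aug_2023": [4, 4, 3, 5, 4, 4],
        "nov_2023": [5, 4, 4, 5, 4, 5],
    },
    "Can tinnitus be cured?": {
        "aug_2023": [3, 3, 4, 3, 2, 3],
        "nov_2023": [4, 3, 4, 4, 3, 4],
    },
}

def mean_rating(scores):
    """Average of the six expert scores for one question at one time point."""
    return mean(scores)

def word_count(response):
    """Simple whitespace-based word count, used to gauge response length."""
    return len(response.split())

for question, by_date in ratings.items():
    aug = mean_rating(by_date["aug_2023"])
    nov = mean_rating(by_date["nov_2023"])
    print(f"{question}: Aug {aug:.2f} -> Nov {nov:.2f} (change {nov - aug:+.2f})")
```

With the illustrative scores above, both questions show a higher mean rating at the second test point, mirroring the study's observation that responses generally improved after 3 months.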