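Reliability and readability analysis of ChatGPT-4 and Google Bard as a patient information source for the most commonly applied radionuclide treatments in cancer patients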

IF 1.6 · Zone 4, Medicine · Q3 · RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
H. Şan, Ö. Bayrakçi, B. Çağdaş, M. Serdengeçti, E. Alagöz
Revista Espanola De Medicina Nuclear E Imagen Molecular, 43(4), Article 500021. Journal Article. Published 2024-07-01.
DOI: 10.1016/j.remn.2024.500021
https://www.sciencedirect.com/science/article/pii/S2253654X24000295
Citations: 0

Abstract


Purpose

Searching for health information online is a popular way for patients to improve their knowledge of their diseases. Recently developed AI chatbots are probably the easiest means of doing so. The purpose of this study is to analyze the reliability and readability of AI chatbot responses about the most commonly applied radionuclide treatments in cancer patients.

Methods

Basic patient questions, thirty about RAI, PRRT and TARE treatments and twenty-nine about PSMA-TRT, were put one by one to GPT-4 and Bard in January 2024. The reliability and readability of the responses were assessed using the DISCERN scale, Flesch Reading Ease (FRE) and Flesch-Kincaid Reading Grade Level (FKRGL).
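The two readability measures named above follow the standard published Flesch formulas. The sketch below shows how they could be computed in Python. It is illustrative only: the abstract does not say which tool the authors used, and the function names and the vowel-group syllable counter are assumptions introduced here (real readability tools count syllables more carefully).

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption, not the study's method):
    count vowel groups and drop a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(sentences: int, words: int, syllables: int) -> float:
    """Standard FRE formula: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(sentences: int, words: int, syllables: int) -> float:
    """Standard Flesch-Kincaid formula: approximate US school grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical chatbot-style answer, used only to exercise the functions.
    sample = (
        "Radioactive iodine is taken by mouth. "
        "It is absorbed by thyroid tissue and destroys remaining cells."
    )
    counts = text_counts(sample)
    print("FRE:", round(flesch_reading_ease(*counts), 2))
    print("FKRGL:", round(flesch_kincaid_grade(*counts), 2))
```

Both formulas depend only on average sentence length and average syllables per word, so long sentences and polysyllabic medical terms push the grade level up quickly.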

Results

The mean (SD) FKRGL scores for the responses about RAI, PSMA-TRT, PRRT and TARE treatments were 14.57 (1.19), 14.65 (1.38), 14.25 (1.10) and 14.38 (1.2) for GPT-4, and 11.49 (1.59), 12.42 (1.71), 11.35 (1.80) and 13.01 (1.97) for Google Bard, respectively. For all four treatments, the FKRGL scores of the responses of both GPT-4 and Google Bard were above the reading grade level of the general public. The mean (SD) DISCERN scores assessed by a nuclear medicine physician for the responses about RAI, PSMA-TRT, PRRT and TARE treatments were 47.86 (5.09), 48.48 (4.22), 46.76 (4.09) and 48.33 (5.15) for GPT-4, and 51.50 (5.64), 53.44 (5.42), 53 (6.36) and 49.43 (5.32) for Bard, respectively. Based on mean DISCERN scores, the reliability of the responses of GPT-4 and Google Bard about RAI, PSMA-TRT, PRRT and TARE treatments ranged from fair to good. The inter-rater reliability correlation coefficients of the DISCERN scores assigned by GPT-4, Bard and the nuclear medicine physician to the GPT-4 responses about RAI, PSMA-TRT, PRRT and TARE treatments were 0.512 (95% CI 0.296-0.704), 0.695 (95% CI 0.518-0.829), 0.687 (95% CI 0.511-0.823) and 0.649 (95% CI 0.462-0.798), respectively (P<.01). For the Bard responses, the corresponding coefficients were 0.753 (95% CI 0.602-0.863), 0.812 (95% CI 0.686-0.899), 0.804 (95% CI 0.677-0.894) and 0.671 (95% CI 0.489-0.812), respectively (P<.01). The inter-rater reliability for the responses of both Bard and GPT-4 about RAI, PSMA-TRT, PRRT and TARE treatments was moderate to good. Further, consulting a nuclear medicine physician was rarely emphasized by either GPT-4 or Google Bard; references were included in some Google Bard responses but in none of the GPT-4 responses.
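The verbal labels in the results ("fair to good", "moderate to good") can be reproduced by mapping the reported numbers onto score bands. The sketch below does this in Python. The cut-offs are assumptions: the abstract does not state which bands the authors applied. The DISCERN bands are the ones commonly used for the 16-80 total score, and the reliability bands are the commonly cited thresholds for intraclass correlation coefficients (the abstract does not specify the coefficient type). Function names are hypothetical.

```python
def discern_band(total: float) -> str:
    """Map a DISCERN total (16-80) to a quality label.
    Assumed bands: 16-26 very poor, 27-38 poor, 39-50 fair,
    51-62 good, 63-80 excellent."""
    if not 16 <= total <= 80:
        raise ValueError("DISCERN totals range from 16 to 80")
    for upper, label in ((27, "very poor"), (39, "poor"), (51, "fair"), (63, "good")):
        if total < upper:
            return label
    return "excellent"


def reliability_band(coefficient: float) -> str:
    """Map a reliability coefficient to a label.
    Assumed bands: <0.5 poor, 0.5-0.75 moderate, 0.75-0.9 good, >=0.9 excellent."""
    if coefficient < 0.5:
        return "poor"
    if coefficient < 0.75:
        return "moderate"
    if coefficient < 0.9:
        return "good"
    return "excellent"


# Mean DISCERN scores and inter-rater coefficients as reported in the abstract.
TREATMENTS = ("RAI", "PSMA-TRT", "PRRT", "TARE")
MEAN_DISCERN = {
    "GPT-4": (47.86, 48.48, 46.76, 48.33),
    "Bard": (51.50, 53.44, 53.0, 49.43),
}
INTER_RATER = {
    "GPT-4": (0.512, 0.695, 0.687, 0.649),
    "Bard": (0.753, 0.812, 0.804, 0.671),
}

if __name__ == "__main__":
    for bot in ("GPT-4", "Bard"):
        for name, score, coef in zip(TREATMENTS, MEAN_DISCERN[bot], INTER_RATER[bot]):
            print(f"{bot:6} {name:9} DISCERN {score:5.2f} -> {discern_band(score):5}"
                  f" | reliability {coef:.3f} -> {reliability_band(coef)}")
```

Under these assumed bands, every reported mean DISCERN score falls in "fair" or "good" and every coefficient in "moderate" or "good", which matches the wording of the results.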

Conclusion

Although the information provided by AI chatbots may be acceptable in medical terms, it may not be easy for the general public to read, which may prevent it from being understood. Effective prompts crafted through 'prompt engineering' may make the responses more comprehensible. Since radionuclide treatments fall within nuclear medicine expertise, responses need to name nuclear medicine physicians as the point of consultation in order to guide patients and caregivers toward accurate medical advice. Referencing is important for the confidence and satisfaction of patients and caregivers seeking information.

Source journal
Revista Espanola De Medicina Nuclear E Imagen Molecular
Category: RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
CiteScore: 1.10
Self-citation rate: 16.70%
Articles published: 85
Review time: 24 days
About the journal: The Revista Española de Medicina Nuclear e Imagen Molecular (Spanish Journal of Nuclear Medicine and Molecular Imaging) was founded in 1982 and is the official journal of the Spanish Society of Nuclear Medicine and Molecular Imaging, which has more than 700 members. The Journal, which publishes 6 regular issues per year, has as its main aim the promotion of research and continuing education in all fields of Nuclear Medicine. To that end, its principal sections are Originals, Clinical Notes, Images of Interest, and Special Collaboration articles.