Patient- and clinician-based evaluation of large language models for patient education in prostate cancer radiotherapy

IF 2.7 · CAS Major Category: Medicine, Tier 3 · JCR Q3 (Oncology)
Christian Trapp, Nina Schmidt-Hegemann, Michael Keilholz, Sarah Frederike Brose, Sebastian N Marschner, Stephan Schönecker, Sebastian H Maier, Diana-Coralia Dehelean, Maya Rottler, Dinah Konnerth, Claus Belka, Stefanie Corradini, Paul Rogowski
{"title":"基于患者和临床的前列腺癌放疗患者教育大语言模型评估。","authors":"Christian Trapp, Nina Schmidt-Hegemann, Michael Keilholz, Sarah Frederike Brose, Sebastian N Marschner, Stephan Schönecker, Sebastian H Maier, Diana-Coralia Dehelean, Maya Rottler, Dinah Konnerth, Claus Belka, Stefanie Corradini, Paul Rogowski","doi":"10.1007/s00066-024-02342-3","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>This study aims to evaluate the capabilities and limitations of large language models (LLMs) for providing patient education for men undergoing radiotherapy for localized prostate cancer, incorporating assessments from both clinicians and patients.</p><p><strong>Methods: </strong>Six questions about definitive radiotherapy for prostate cancer were designed based on common patient inquiries. These questions were presented to different LLMs [ChatGPT‑4, ChatGPT-4o (both OpenAI Inc., San Francisco, CA, USA), Gemini (Google LLC, Mountain View, CA, USA), Copilot (Microsoft Corp., Redmond, WA, USA), and Claude (Anthropic PBC, San Francisco, CA, USA)] via the respective web interfaces. Responses were evaluated for readability using the Flesch Reading Ease Index. Five radiation oncologists assessed the responses for relevance, correctness, and completeness using a five-point Likert scale. Additionally, 35 prostate cancer patients evaluated the responses from ChatGPT‑4 for comprehensibility, accuracy, relevance, trustworthiness, and overall informativeness.</p><p><strong>Results: </strong>The Flesch Reading Ease Index indicated that the responses from all LLMs were relatively difficult to understand. All LLMs provided answers that clinicians found to be generally relevant and correct. The answers from ChatGPT‑4, ChatGPT-4o, and Claude AI were also found to be complete. However, we found significant differences between the performance of different LLMs regarding relevance and completeness. Some answers lacked detail or contained inaccuracies. Patients perceived the information as easy to understand and relevant, with most expressing confidence in the information and a willingness to use ChatGPT‑4 for future medical questions. ChatGPT-4's responses helped patients feel better informed, despite the initially standardized information provided.</p><p><strong>Conclusion: </strong>Overall, LLMs show promise as a tool for patient education in prostate cancer radiotherapy. While improvements are needed in terms of accuracy and readability, positive feedback from clinicians and patients suggests that LLMs can enhance patient understanding and engagement. 
Further research is essential to fully realize the potential of artificial intelligence in patient education.</p>","PeriodicalId":21998,"journal":{"name":"Strahlentherapie und Onkologie","volume":" ","pages":""},"PeriodicalIF":2.7000,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Patient- and clinician-based evaluation of large language models for patient education in prostate cancer radiotherapy.\",\"authors\":\"Christian Trapp, Nina Schmidt-Hegemann, Michael Keilholz, Sarah Frederike Brose, Sebastian N Marschner, Stephan Schönecker, Sebastian H Maier, Diana-Coralia Dehelean, Maya Rottler, Dinah Konnerth, Claus Belka, Stefanie Corradini, Paul Rogowski\",\"doi\":\"10.1007/s00066-024-02342-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>This study aims to evaluate the capabilities and limitations of large language models (LLMs) for providing patient education for men undergoing radiotherapy for localized prostate cancer, incorporating assessments from both clinicians and patients.</p><p><strong>Methods: </strong>Six questions about definitive radiotherapy for prostate cancer were designed based on common patient inquiries. These questions were presented to different LLMs [ChatGPT‑4, ChatGPT-4o (both OpenAI Inc., San Francisco, CA, USA), Gemini (Google LLC, Mountain View, CA, USA), Copilot (Microsoft Corp., Redmond, WA, USA), and Claude (Anthropic PBC, San Francisco, CA, USA)] via the respective web interfaces. Responses were evaluated for readability using the Flesch Reading Ease Index. Five radiation oncologists assessed the responses for relevance, correctness, and completeness using a five-point Likert scale. Additionally, 35 prostate cancer patients evaluated the responses from ChatGPT‑4 for comprehensibility, accuracy, relevance, trustworthiness, and overall informativeness.</p><p><strong>Results: </strong>The Flesch Reading Ease Index indicated that the responses from all LLMs were relatively difficult to understand. All LLMs provided answers that clinicians found to be generally relevant and correct. The answers from ChatGPT‑4, ChatGPT-4o, and Claude AI were also found to be complete. However, we found significant differences between the performance of different LLMs regarding relevance and completeness. Some answers lacked detail or contained inaccuracies. Patients perceived the information as easy to understand and relevant, with most expressing confidence in the information and a willingness to use ChatGPT‑4 for future medical questions. ChatGPT-4's responses helped patients feel better informed, despite the initially standardized information provided.</p><p><strong>Conclusion: </strong>Overall, LLMs show promise as a tool for patient education in prostate cancer radiotherapy. While improvements are needed in terms of accuracy and readability, positive feedback from clinicians and patients suggests that LLMs can enhance patient understanding and engagement. 
Further research is essential to fully realize the potential of artificial intelligence in patient education.</p>\",\"PeriodicalId\":21998,\"journal\":{\"name\":\"Strahlentherapie und Onkologie\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2025-01-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Strahlentherapie und Onkologie\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1007/s00066-024-02342-3\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"ONCOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Strahlentherapie und Onkologie","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1007/s00066-024-02342-3","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ONCOLOGY","Score":null,"Total":0}
Citations: 0

Abstract


Background: This study aims to evaluate the capabilities and limitations of large language models (LLMs) for providing patient education for men undergoing radiotherapy for localized prostate cancer, incorporating assessments from both clinicians and patients.

Methods: Six questions about definitive radiotherapy for prostate cancer were designed based on common patient inquiries. These questions were presented to different LLMs [ChatGPT‑4, ChatGPT-4o (both OpenAI Inc., San Francisco, CA, USA), Gemini (Google LLC, Mountain View, CA, USA), Copilot (Microsoft Corp., Redmond, WA, USA), and Claude (Anthropic PBC, San Francisco, CA, USA)] via the respective web interfaces. Responses were evaluated for readability using the Flesch Reading Ease Index. Five radiation oncologists assessed the responses for relevance, correctness, and completeness using a five-point Likert scale. Additionally, 35 prostate cancer patients evaluated the responses from ChatGPT‑4 for comprehensibility, accuracy, relevance, trustworthiness, and overall informativeness.
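
Readability here refers to the Flesch Reading Ease formula, 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words), where higher scores indicate easier text and scores below about 30 are conventionally read as "very difficult". The following is a minimal illustrative Python sketch of such a computation; the vowel-group syllable counter is a crude heuristic for demonstration only and is not the tooling used in the study.

```python
import re

def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels.
    Dedicated readability tools use dictionaries or better heuristics."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))

# Example: score a hypothetical LLM answer to one of the six patient questions
answer = ("External beam radiotherapy delivers focused ionizing radiation to the prostate "
          "over several weeks. Side effects may include urinary urgency and fatigue.")
print(f"Flesch Reading Ease: {flesch_reading_ease(answer):.1f}")
```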

Results: The Flesch Reading Ease Index indicated that the responses from all LLMs were relatively difficult to understand. All LLMs provided answers that clinicians found to be generally relevant and correct. The answers from ChatGPT‑4, ChatGPT-4o, and Claude AI were also found to be complete. However, we found significant differences between the performance of different LLMs regarding relevance and completeness. Some answers lacked detail or contained inaccuracies. Patients perceived the information as easy to understand and relevant, with most expressing confidence in the information and a willingness to use ChatGPT‑4 for future medical questions. ChatGPT-4's responses helped patients feel better informed, despite the initially standardized information provided.
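
The abstract does not name the statistical test behind the reported between-model differences; as one plausible illustration, five-point Likert ratings pooled across raters and questions could be compared with a non-parametric Kruskal-Wallis test, sketched below on made-up ratings (the values are not study data).

```python
from scipy.stats import kruskal

# Hypothetical "completeness" ratings pooled over 5 clinicians x 6 questions
# per model (illustrative values only, not data from the study).
ratings = {
    "ChatGPT-4":  [5, 5, 4, 5, 4, 5, 4, 5, 5, 4],
    "ChatGPT-4o": [5, 4, 5, 4, 5, 5, 4, 4, 5, 5],
    "Gemini":     [3, 4, 3, 4, 3, 3, 4, 3, 4, 3],
    "Copilot":    [4, 3, 3, 4, 3, 4, 3, 3, 4, 4],
    "Claude":     [5, 4, 5, 5, 4, 4, 5, 5, 4, 5],
}

# Kruskal-Wallis H-test: compares the rating distributions across the five models.
h_stat, p_value = kruskal(*ratings.values())
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```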

Conclusion: Overall, LLMs show promise as a tool for patient education in prostate cancer radiotherapy. While improvements are needed in terms of accuracy and readability, positive feedback from clinicians and patients suggests that LLMs can enhance patient understanding and engagement. Further research is essential to fully realize the potential of artificial intelligence in patient education.

Source journal metrics:
CiteScore: 5.70
Self-citation rate: 12.90%
Articles per year: 141
Time to review: 3-8 weeks
Journal description: Strahlentherapie und Onkologie, published monthly, is a scientific journal that covers all aspects of oncology with focus on radiooncology, radiation biology and radiation physics. The articles are not only of interest to radiooncologists but to all physicians interested in oncology, to radiation biologists and radiation physicists. The journal publishes original articles, review articles and case studies that are peer-reviewed. It includes scientific short communications as well as a literature review with annotated articles that inform the reader on new developments in the various disciplines concerned and hence allow for a sound overview on the latest results in radiooncology research. Founded in 1912, Strahlentherapie und Onkologie is the oldest oncological journal in the world. Today, contributions are published in English and German. All articles have English summaries and legends. The journal is the official publication of several scientific radiooncological societies and publishes the relevant communications of these societies.