Assessing accuracy of ChatGPT in response to questions from day to day pharmaceutical care in hospitals

Exploratory Research in Clinical and Social Pharmacy · IF 1.8 · Q3 (Pharmacology & Pharmacy) · Published 2024-06-13 · DOI: 10.1016/j.rcsop.2024.100464
Merel van Nuland, Anne-Fleur H. Lobbezoo, Ewoudt M.W. van de Garde, Maikel Herbrink, Inger van Heijl, Tim Bognàr, Jeroen P.A. Houwen, Marloes Dekens, Demi Wannet, Toine Egberts, Paul D. van der Linden

Abstract

Background

The advent of Large Language Models (LLMs) such as ChatGPT introduces opportunities within the medical field. Nonetheless, the use of LLMs poses a risk when healthcare practitioners and patients present clinical questions to these programs without a comprehensive understanding of their suitability for clinical contexts.

Objective

The objective of this study was to assess ChatGPT's ability to generate appropriate responses to clinical questions that hospital pharmacists could encounter during routine patient care.

Methods

Thirty questions from 10 different domains within clinical pharmacy were collected during routine care. Questions were presented to ChatGPT in a standardized format, including the patient's age, sex, drug name, dose, and indication. Relevant case-specific information was then provided, and the prompt concluded with the query "What would a hospital pharmacist do?". The impact on accuracy was assessed for each domain by changing the personification to "What would you do?", presenting the question in Dutch, and regenerating the primary question. All responses were independently evaluated by two senior hospital pharmacists, focusing on the availability of advice, accuracy, and concordance.
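To make this protocol concrete, the minimal Python sketch below reconstructs the standardized prompt and the regeneration step. It is illustrative only: the study posed its questions through the ChatGPT interface, so the OpenAI API client, the model name, the `build_prompt` helper, and the vancomycin case shown here are assumptions for illustration, not the authors' materials.

```python
# Illustrative reconstruction of the prompt protocol described above; all
# names and the clinical case are hypothetical, not the authors' setup.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def build_prompt(age, sex, drug, dose, indication, case_details,
                 personification="What would a hospital pharmacist do?"):
    """Assemble the standardized format: patient characteristics, drug
    information, case-specific details, and the closing query."""
    return (f"Patient: {age}-year-old {sex}. "
            f"Drug: {drug}, {dose}, for {indication}. "
            f"{case_details} {personification}")


def ask(prompt, model="gpt-3.5-turbo"):
    """Send a single prompt and return the model's reply."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# A hypothetical case, plus the variations tested per domain.
case = dict(age=67, sex="male", drug="vancomycin",
            dose="1000 mg IV twice daily",
            indication="suspected MRSA bacteraemia",
            case_details="The trough level after the third dose is 8 mg/L.")

primary = ask(build_prompt(**case))
personified = ask(build_prompt(**case,
                               personification="What would you do?"))
regenerated = ask(build_prompt(**case))  # reproducibility: same prompt again
# The third variation, posing the question in Dutch, would translate the
# entire prompt before sending it.
```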

Results

For 77% of questions, ChatGPT provided advice in response to the question. For these responses, accuracy and concordance were determined. Accuracy was correct and complete for 26% of responses, correct but incomplete for 22%, partially correct and partially incorrect for 30%, and completely incorrect for 22%. Reproducibility was poor, with only 10% of responses remaining consistent upon regeneration of the primary question.

Conclusions

While concordance of responses was excellent, the accuracy and reproducibility were poor. With the described method, ChatGPT should not be used to address questions encountered by hospital pharmacists during their shifts. However, it is important to acknowledge the limitations of our methodology, including potential biases, which may have influenced the findings.
