An assessment of generative artificial intelligence in responding to clinical queries on tapering antidepressants
Authors: Maeve Mac Oscar, Miriam Boland, Cathal Cadogan
Journal: Research in Social & Administrative Pharmacy (impact factor 2.8; JCR Q1, Public, Environmental & Occupational Health)
Publication type: Journal Article
Published: 2025-06-24
DOI: https://doi.org/10.1016/j.sapharm.2025.06.107
Abstract
Background: A substantial cohort of individuals rely on online resources, such as discussion forums, for support on tapering antidepressants. This study aimed to assess the performance of generative artificial intelligence (AI) in responding to clinical queries on tapering antidepressants.
Methods: Ten queries on tapering antidepressants were developed based on previous research, prescribing guidelines, and online peer support forums. The queries covered reasons for discontinuing antidepressants, tapering methods, withdrawal symptoms, and relapse. Each query was submitted to ChatGPT (OpenAI, San Francisco, CA) as a standalone query, using the GPT-4 model and standardised prompts. Two researchers working independently evaluated each response for relevance, accuracy, completeness, and clarity.
Results: GPT-4 responses to all ten tapering queries were considered relevant and within scope. Most responses (8/10) incorporated safety netting by emphasising the importance of consulting healthcare professionals before making any medication changes. Accuracy, completeness, and clarity were rated less favourably than relevance. The response to a query on hyperbolic tapering received the least favourable assessment because it incorrectly referred to logarithmic reductions and gave inaccurate examples of fixed dosage reductions. Several instances of AI hallucination were identified, including fabricated references.
Conclusion: Generative AI is having a transformative impact on healthcare, including how healthcare professionals and patients access information about clinical queries, such as antidepressant tapering. The study findings show that GPT-4 was able to provide relevant and safety-conscious responses on antidepressant tapering. However, performance issues such as inconsistencies and inaccuracies in tapering recommendations highlight the important role that healthcare professionals continue to play in providing patients with clinically trained, professional support in safely managing health-related issues. Further research on developing AI evaluation tools is needed to ensure consistency in the approaches used in evaluating the performance of AI in addressing clinical queries.
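The Results distinguish hyperbolic tapering from fixed dosage reductions, which the abstract itself does not define. The sketch below is one way to illustrate that difference numerically. It assumes a simple saturating dose-occupancy curve, occupancy = dose / (dose + k), and treats a hyperbolic schedule as one whose steps remove equal amounts of occupancy rather than equal amounts of dose. The function names, the constant k, and the unitless starting dose are all hypothetical and are not taken from the study or from any prescribing guideline.

```python
def occupancy(dose, k):
    """Fraction of target occupied under the assumed model dose / (dose + k)."""
    return dose / (dose + k)


def dose_for(occ, k):
    """Inverse of occupancy(): the dose that gives occupancy fraction occ (occ < 1)."""
    return occ * k / (1.0 - occ)


def fixed_schedule(start, steps):
    """Doses from start to zero in equal dose decrements."""
    return [start * (steps - i) / steps for i in range(steps + 1)]


def hyperbolic_schedule(start, steps, k):
    """Doses from start to zero in equal occupancy decrements (assumed model)."""
    top = occupancy(start, k)
    return [dose_for(top * (steps - i) / steps, k) for i in range(steps + 1)]


if __name__ == "__main__":
    # Hypothetical, unitless values chosen only to make the arithmetic easy to follow.
    start, steps, k = 20.0, 4, 5.0
    for name, doses in (
        ("fixed", fixed_schedule(start, steps)),
        ("hyperbolic", hyperbolic_schedule(start, steps, k)),
    ):
        row = ", ".join(f"{d:.2f} ({occupancy(d, k):.0%})" for d in doses)
        print(f"{name:>10}: {row}")
```

With these illustrative values, the fixed schedule steps through doses of 20, 15, 10, 5, and 0, while the equal-occupancy schedule steps through roughly 20, 7.5, 3.3, 1.25, and 0. Under the assumed curve, the dose decrements of a hyperbolic schedule shrink as the dose falls, which is why examples built on fixed dosage reductions do not describe it.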
About the journal:
Research in Social and Administrative Pharmacy (RSAP) is a quarterly publication featuring original scientific reports and comprehensive review articles in the social and administrative pharmaceutical sciences. Topics of interest include outcomes evaluation of products, programs, or services; pharmacoepidemiology; medication adherence; direct-to-consumer advertising of prescription medications; disease state management; health systems reform; drug marketing; medication distribution systems such as e-prescribing; web-based pharmaceutical/medical services; drug commerce and re-importation; and health professions workforce issues.