GP or ChatGPT? Ability of large language models (LLMs) to support general practitioners when prescribing antibiotics.

IF 3.9 | CAS Tier 2 (Medicine) | JCR Q1 (Infectious Diseases)
Oanh Ngoc Nguyen, Doaa Amin, James Bennett, Øystein Hetlevik, Sara Malik, Andrew Tout, Heike Vornhagen, Akke Vellinga
{"title":"全科医生还是 ChatGPT?大型语言模型 (LLM) 在为全科医生开抗生素处方时提供支持的能力。","authors":"Oanh Ngoc Nguyen, Doaa Amin, James Bennett, Øystein Hetlevik, Sara Malik, Andrew Tout, Heike Vornhagen, Akke Vellinga","doi":"10.1093/jac/dkaf077","DOIUrl":null,"url":null,"abstract":"<p><strong>Introduction: </strong>Large language models (LLMs) are becoming ubiquitous and widely implemented. LLMs could also be used for diagnosis and treatment. National antibiotic prescribing guidelines are customized and informed by local laboratory data on antimicrobial resistance.</p><p><strong>Methods: </strong>Based on 24 vignettes with information on type of infection, gender, age group and comorbidities, GPs and LLMs were prompted to provide a treatment. Four countries (Ireland, UK, USA and Norway) were included and a GP from each country and six LLMs (ChatGPT, Gemini, Copilot, Mistral AI, Claude and Llama 3.1) were provided with the vignettes, including their location (country). Responses were compared with the country's national prescribing guidelines. In addition, limitations of LLMs such as hallucination, toxicity and data leakage were assessed.</p><p><strong>Results: </strong>GPs' answers to the vignettes showed high accuracy in relation to diagnosis (96%-100%) and yes/no antibiotic prescribing (83%-92%). GPs referenced (100%) and prescribed (58%-92%) according to national guidelines, but dose/duration of treatment was less accurate (50%-75%). Overall, the GPs' accuracy had a mean of 74%. LLMs scored high in relation to diagnosis (92%-100%), antibiotic prescribing (88%-100%) and the choice of antibiotic (59%-100%) but correct referencing often failed (38%-96%), in particular for the Norwegian guidelines (0%-13%). Data leakage was shown to be an issue as personal information was repeated in the models' responses to the vignettes.</p><p><strong>Conclusions: </strong>LLMs may be safe to guide antibiotic prescribing in general practice. However, to interpret vignettes, apply national guidelines and prescribe the right dose and duration, GPs remain best placed.</p>","PeriodicalId":14969,"journal":{"name":"Journal of Antimicrobial Chemotherapy","volume":" ","pages":""},"PeriodicalIF":3.9000,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"GP or ChatGPT? Ability of large language models (LLMs) to support general practitioners when prescribing antibiotics.\",\"authors\":\"Oanh Ngoc Nguyen, Doaa Amin, James Bennett, Øystein Hetlevik, Sara Malik, Andrew Tout, Heike Vornhagen, Akke Vellinga\",\"doi\":\"10.1093/jac/dkaf077\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Introduction: </strong>Large language models (LLMs) are becoming ubiquitous and widely implemented. LLMs could also be used for diagnosis and treatment. National antibiotic prescribing guidelines are customized and informed by local laboratory data on antimicrobial resistance.</p><p><strong>Methods: </strong>Based on 24 vignettes with information on type of infection, gender, age group and comorbidities, GPs and LLMs were prompted to provide a treatment. Four countries (Ireland, UK, USA and Norway) were included and a GP from each country and six LLMs (ChatGPT, Gemini, Copilot, Mistral AI, Claude and Llama 3.1) were provided with the vignettes, including their location (country). Responses were compared with the country's national prescribing guidelines. 
In addition, limitations of LLMs such as hallucination, toxicity and data leakage were assessed.</p><p><strong>Results: </strong>GPs' answers to the vignettes showed high accuracy in relation to diagnosis (96%-100%) and yes/no antibiotic prescribing (83%-92%). GPs referenced (100%) and prescribed (58%-92%) according to national guidelines, but dose/duration of treatment was less accurate (50%-75%). Overall, the GPs' accuracy had a mean of 74%. LLMs scored high in relation to diagnosis (92%-100%), antibiotic prescribing (88%-100%) and the choice of antibiotic (59%-100%) but correct referencing often failed (38%-96%), in particular for the Norwegian guidelines (0%-13%). Data leakage was shown to be an issue as personal information was repeated in the models' responses to the vignettes.</p><p><strong>Conclusions: </strong>LLMs may be safe to guide antibiotic prescribing in general practice. However, to interpret vignettes, apply national guidelines and prescribe the right dose and duration, GPs remain best placed.</p>\",\"PeriodicalId\":14969,\"journal\":{\"name\":\"Journal of Antimicrobial Chemotherapy\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":3.9000,\"publicationDate\":\"2025-03-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Antimicrobial Chemotherapy\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1093/jac/dkaf077\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"INFECTIOUS DISEASES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Antimicrobial Chemotherapy","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1093/jac/dkaf077","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"INFECTIOUS DISEASES","Score":null,"Total":0}
引用次数: 0

Abstract

Introduction: Large language models (LLMs) are becoming ubiquitous and widely implemented. LLMs could also be used for diagnosis and treatment. National antibiotic prescribing guidelines are customized and informed by local laboratory data on antimicrobial resistance.

Methods: Based on 24 vignettes with information on type of infection, gender, age group and comorbidities, GPs and LLMs were prompted to provide a treatment. Four countries (Ireland, UK, USA and Norway) were included and a GP from each country and six LLMs (ChatGPT, Gemini, Copilot, Mistral AI, Claude and Llama 3.1) were provided with the vignettes, including their location (country). Responses were compared with the country's national prescribing guidelines. In addition, limitations of LLMs such as hallucination, toxicity and data leakage were assessed.
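
The abstract does not reproduce the study's exact prompts, so the following is only a minimal sketch of the workflow it describes: a structured vignette (infection type, gender, age group, comorbidities) plus the GP's country, sent to one of the listed models. The vignette fields, prompt wording, model name and use of the OpenAI Python client are assumptions for illustration, not the authors' protocol.

```python
# Hypothetical sketch of the vignette-prompting workflow described in Methods.
# Vignette fields, prompt text and model choice are illustrative only.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

vignette = {
    "country": "Ireland",
    "infection": "uncomplicated urinary tract infection",
    "gender": "female",
    "age_group": "18-40 years",
    "comorbidities": "none",
}

prompt = (
    f"You are advising a general practitioner in {vignette['country']}. "
    f"Patient: {vignette['gender']}, {vignette['age_group']}, "
    f"comorbidities: {vignette['comorbidities']}. "
    f"Presentation: {vignette['infection']}. "
    "State the diagnosis, whether an antibiotic is indicated, and if so which "
    "antibiotic, at what dose and for how long, citing the national guideline used."
)

response = client.chat.completions.create(
    model="gpt-4o",  # the paper names ChatGPT; the exact model version is not stated in the abstract
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```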

Results: GPs' answers to the vignettes showed high accuracy in relation to diagnosis (96%-100%) and yes/no antibiotic prescribing (83%-92%). GPs referenced (100%) and prescribed (58%-92%) according to national guidelines, but dose/duration of treatment was less accurate (50%-75%). Overall, the GPs' accuracy had a mean of 74%. LLMs scored high in relation to diagnosis (92%-100%), antibiotic prescribing (88%-100%) and the choice of antibiotic (59%-100%) but correct referencing often failed (38%-96%), in particular for the Norwegian guidelines (0%-13%). Data leakage was shown to be an issue as personal information was repeated in the models' responses to the vignettes.
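
The accuracy figures above pool several criteria per vignette (diagnosis, antibiotic yes/no, antibiotic choice, dose/duration, guideline referencing). A hypothetical sketch of per-criterion scoring is shown below; the `Assessment` fields, equality-based matching and example values are assumptions for illustration, not the authors' published rubric, which would require clinical judgement rather than string comparison for items such as dose/duration.

```python
# Hypothetical per-criterion scoring of a graded response against a guideline-based reference.
# Criteria mirror those reported in Results; the scoring rule itself is an assumption.
from dataclasses import dataclass

CRITERIA = ["diagnosis", "prescribe_antibiotic", "antibiotic_choice",
            "dose_duration", "guideline_referenced"]

@dataclass
class Assessment:
    diagnosis: str
    prescribe_antibiotic: bool
    antibiotic_choice: str
    dose_duration: str
    guideline_referenced: bool

def score(response: Assessment, reference: Assessment) -> dict:
    # Mark each criterion correct if it matches the reference answer exactly.
    return {c: getattr(response, c) == getattr(reference, c) for c in CRITERIA}

def mean_accuracy(all_scores: list) -> float:
    # Proportion of correct criteria pooled across all vignettes.
    total_correct = sum(sum(s.values()) for s in all_scores)
    return total_correct / (len(all_scores) * len(CRITERIA))

# Illustrative values only, not taken from the study or any specific guideline.
ref = Assessment("acute cystitis", True, "nitrofurantoin", "100 mg twice daily for 3 days", True)
got = Assessment("acute cystitis", True, "nitrofurantoin", "50 mg four times daily for 7 days", False)
print(score(got, ref))
print(mean_accuracy([score(got, ref)]))
```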

Conclusions: LLMs may be safe to guide antibiotic prescribing in general practice. However, to interpret vignettes, apply national guidelines and prescribe the right dose and duration, GPs remain best placed.

Source journal: Journal of Antimicrobial Chemotherapy
CiteScore: 9.20
Self-citation rate: 5.80%
Articles published: 423
Review turnaround: 2-4 weeks
About the journal: The Journal publishes articles that further knowledge and advance the science and application of antimicrobial chemotherapy with antibiotics and antifungal, antiviral and antiprotozoal agents. The Journal publishes primarily in human medicine, and articles in veterinary medicine likely to have an impact on global health.