Comparative analysis of large language models against the NHS 111 online triaging for emergency ophthalmology

IF 2.8 · CAS Tier 3 (Medicine) · JCR Q1 (Ophthalmology)
Eye · Pub Date: 2025-01-21 · DOI: 10.1038/s41433-025-03605-8
Shaheryar Ahmed Khan, Chrishan Gunasekera
Citations: 0

Abstract


Background: This study presents a comprehensive evaluation of the performance of various large language models in generating responses for ophthalmology emergencies and compares their accuracy with the United Kingdom's established National Health Service (NHS) 111 online system.

Methods: We included 21 ophthalmology-related emergency scenario questions from the NHS 111 triaging algorithm. These questions were based on four different ophthalmology emergency themes as laid out in the NHS 111 algorithm. Responses generated from NHS 111 online were compared with the responses of different LLM chatbots to determine the accuracy of the LLM responses. We included a range of models: ChatGPT-3.5, Google Bard, Bing Chat, and ChatGPT-4.0. The accuracy of each LLM chatbot response was compared against the NHS 111 triage using a two-prompt strategy. Answers were graded as follows: -2 as "Very poor", -1 as "Poor", 0 as "No response", 1 as "Good", 2 as "Very good", and 3 as "Excellent".
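The grading scheme above can be sketched in code. This is a minimal illustration of the scoring logic described in the abstract; the example scores are invented for demonstration and are not the study's data.

```python
# Grading scale as described in the Methods section (-2 to 3 per response).
GRADES = {
    -2: "Very poor",
    -1: "Poor",
    0: "No response",
    1: "Good",
    2: "Very good",
    3: "Excellent",
}

def share_graded_good(scores):
    """Fraction of responses scoring >= 1, i.e. 'Good' or better."""
    return sum(s >= 1 for s in scores) / len(scores)

# Illustrative (hypothetical) scores for 21 scenario questions from one model:
example_scores = [3, 2, 1, 2, 3, 1, 2, 0, 3, 2, 1, 3, 2, 2, 1, 3, 2, 1, 2, 3, 2]
print(f"{share_graded_good(example_scores):.0%} scored Good or better")
# prints "95% scored Good or better"
```

The study's headline figure (93% of responses scoring ≥1) is this same proportion computed across all models' responses.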

Results: Overall, the LLMs attained good accuracy in this study when compared against the NHS 111 responses. A score of ≥1 ("Good" or better) was achieved by 93% of all LLM responses, meaning at least part of the answer contained correct information and none of it contained wrong information. Results were very similar overall on both prompts, with no marked difference between them.

Conclusions: The high accuracy and safety observed in LLM responses support their potential as effective tools for providing timely information and guidance to patients. LLMs hold promise for enhancing patient care and healthcare accessibility in the digital age.

Source journal: Eye (Medicine - Ophthalmology)
CiteScore: 6.40
Self-citation rate: 5.10%
Articles per year: 481
Review time: 3-6 weeks
Journal description: Eye seeks to provide the international practising ophthalmologist with high-quality articles, of academic rigour, on the latest global clinical and laboratory-based research. Its core aim is to advance the science and practice of ophthalmology with the latest clinical- and scientific-based research. Whilst principally aimed at the practising clinician, the journal contains material of interest to a wider readership including optometrists, orthoptists, other health care professionals and research workers in all aspects of the field of visual science worldwide. Eye is the official journal of The Royal College of Ophthalmologists. Eye encourages the submission of original articles covering all aspects of ophthalmology including: external eye disease; oculo-plastic surgery; orbital and lacrimal disease; ocular surface and corneal disorders; paediatric ophthalmology and strabismus; glaucoma; medical and surgical retina; neuro-ophthalmology; cataract and refractive surgery; ocular oncology; ophthalmic pathology; ophthalmic genetics.