大型语言模型是否有助于儿童给药的准确性?

IF 3.1 3区 医学 Q1 PEDIATRICS
Chedva Levin, Brurya Orkaby, Erika Kerner, Mor Saban
{"title":"大型语言模型是否有助于儿童给药的准确性?","authors":"Chedva Levin, Brurya Orkaby, Erika Kerner, Mor Saban","doi":"10.1038/s41390-025-03980-8","DOIUrl":null,"url":null,"abstract":"<p><strong>Background and objective: </strong>Medication errors in pediatric care remain a significant healthcare challenge despite technological advancements, necessitating innovative approaches. This study aims to evaluate Large Language Models' (LLMs) potential in reducing pediatric medication dosage calculation errors compared to experienced nurses.</p><p><strong>Methods: </strong>This cross-sectional study (June-August 2024) involved 101 nurses from pediatric and neonatal departments and three LLMs (ChatGPT-4o, Claude-3.0, Llama 3 8B). Participants completed a nine-question survey on pediatric medication calculations. Primary outcomes were accuracy and response time. Secondary measures included seniority and group membership on accuracy.</p><p><strong>Results: </strong>Significant differences (P < 0.001) were observed between nurses and LLMs. Nurses averaged 93.14 ± 9.39 accuracy. Claude-3.0 and ChatGPT-4o achieved 100 accuracy, while Llama 3 8B was 66 accurate. LLMs were faster (15.7-75.12 seconds) than nurses (1621.2 ± 8379.3 s). The Generalized Linear Model analysis revealed task performance was significantly influenced by duration (Wald χ² = 27,881.261, p < 0.001) and interaction between relative seniority and group membership (Wald χ² = 3,938.250, p < 0.001), with participants achieving a mean total grade of 91.03 (SD = 13.87).</p><p><strong>Conclusions: </strong>Claude-3.0 and ChatGPT-4o demonstrated perfect accuracy and rapid calculation capabilities, showing promise in reducing pediatric medication dosage errors. Further research is needed to explore their integration into practice.</p><p><strong>Impact: </strong>Key Message Large Language Models (LLMs) like ChatGPT-4o and Claude-3.0 demonstrate perfect accuracy and significantly faster response times in pediatric medication dosage calculations, showing potential to reduce errors and save time. Addition to Existing Literature This study provides novel insights by quantitatively comparing LLM performance with experienced nurses, contributing to the understanding of AI's role in improving medication safety. Impact The findings emphasize the value of LLMs as supplemental tools in healthcare, particularly in high-stakes pediatric care, where they can reduce calculation errors and improve clinical efficiency.</p>","PeriodicalId":19829,"journal":{"name":"Pediatric Research","volume":" ","pages":""},"PeriodicalIF":3.1000,"publicationDate":"2025-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Can large language models assist with pediatric dosing accuracy?\",\"authors\":\"Chedva Levin, Brurya Orkaby, Erika Kerner, Mor Saban\",\"doi\":\"10.1038/s41390-025-03980-8\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background and objective: </strong>Medication errors in pediatric care remain a significant healthcare challenge despite technological advancements, necessitating innovative approaches. This study aims to evaluate Large Language Models' (LLMs) potential in reducing pediatric medication dosage calculation errors compared to experienced nurses.</p><p><strong>Methods: </strong>This cross-sectional study (June-August 2024) involved 101 nurses from pediatric and neonatal departments and three LLMs (ChatGPT-4o, Claude-3.0, Llama 3 8B). Participants completed a nine-question survey on pediatric medication calculations. Primary outcomes were accuracy and response time. Secondary measures included seniority and group membership on accuracy.</p><p><strong>Results: </strong>Significant differences (P < 0.001) were observed between nurses and LLMs. Nurses averaged 93.14 ± 9.39 accuracy. Claude-3.0 and ChatGPT-4o achieved 100 accuracy, while Llama 3 8B was 66 accurate. LLMs were faster (15.7-75.12 seconds) than nurses (1621.2 ± 8379.3 s). The Generalized Linear Model analysis revealed task performance was significantly influenced by duration (Wald χ² = 27,881.261, p < 0.001) and interaction between relative seniority and group membership (Wald χ² = 3,938.250, p < 0.001), with participants achieving a mean total grade of 91.03 (SD = 13.87).</p><p><strong>Conclusions: </strong>Claude-3.0 and ChatGPT-4o demonstrated perfect accuracy and rapid calculation capabilities, showing promise in reducing pediatric medication dosage errors. Further research is needed to explore their integration into practice.</p><p><strong>Impact: </strong>Key Message Large Language Models (LLMs) like ChatGPT-4o and Claude-3.0 demonstrate perfect accuracy and significantly faster response times in pediatric medication dosage calculations, showing potential to reduce errors and save time. Addition to Existing Literature This study provides novel insights by quantitatively comparing LLM performance with experienced nurses, contributing to the understanding of AI's role in improving medication safety. Impact The findings emphasize the value of LLMs as supplemental tools in healthcare, particularly in high-stakes pediatric care, where they can reduce calculation errors and improve clinical efficiency.</p>\",\"PeriodicalId\":19829,\"journal\":{\"name\":\"Pediatric Research\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":3.1000,\"publicationDate\":\"2025-03-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Pediatric Research\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1038/s41390-025-03980-8\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"PEDIATRICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pediatric Research","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1038/s41390-025-03980-8","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PEDIATRICS","Score":null,"Total":0}
引用次数: 0

摘要

背景和目的:尽管技术进步,但儿科护理中的用药错误仍然是一个重大的医疗挑战,需要创新的方法。本研究旨在评估与经验丰富的护士相比,大语言模型(LLMs)在减少儿科用药剂量计算错误方面的潜力。方法:本横断面研究(2024年6月- 8月)涉及101名儿科和新生儿科护士和3名法学硕士(chatgpt - 40, Claude-3.0, Llama 38b)。参与者完成了一项关于儿科药物计算的9个问题的调查。主要结果为准确性和反应时间。次要测量包括资历和团队成员对准确性的影响。结果:差异有统计学意义(P)结论:Claude-3.0和chatgpt - 40具有较好的准确性和快速计算能力,有望减少儿童用药剂量错误。如何将其与实践相结合还需要进一步的研究。像chatgpt - 40和Claude-3.0这样的大型语言模型(llm)在儿科药物剂量计算中表现出完美的准确性和显着更快的响应时间,显示出减少错误和节省时间的潜力。本研究通过定量比较法学硕士和经验丰富的护士的表现,提供了新的见解,有助于理解人工智能在提高用药安全方面的作用。研究结果强调了llm作为医疗保健补充工具的价值,特别是在高风险的儿科护理中,llm可以减少计算错误并提高临床效率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Can large language models assist with pediatric dosing accuracy?

Background and objective: Medication errors in pediatric care remain a significant healthcare challenge despite technological advancements, necessitating innovative approaches. This study aims to evaluate Large Language Models' (LLMs) potential in reducing pediatric medication dosage calculation errors compared to experienced nurses.

Methods: This cross-sectional study (June-August 2024) involved 101 nurses from pediatric and neonatal departments and three LLMs (ChatGPT-4o, Claude-3.0, Llama 3 8B). Participants completed a nine-question survey on pediatric medication calculations. Primary outcomes were accuracy and response time. Secondary measures included seniority and group membership on accuracy.

Results: Significant differences (P < 0.001) were observed between nurses and LLMs. Nurses averaged 93.14 ± 9.39 accuracy. Claude-3.0 and ChatGPT-4o achieved 100 accuracy, while Llama 3 8B was 66 accurate. LLMs were faster (15.7-75.12 seconds) than nurses (1621.2 ± 8379.3 s). The Generalized Linear Model analysis revealed task performance was significantly influenced by duration (Wald χ² = 27,881.261, p < 0.001) and interaction between relative seniority and group membership (Wald χ² = 3,938.250, p < 0.001), with participants achieving a mean total grade of 91.03 (SD = 13.87).

Conclusions: Claude-3.0 and ChatGPT-4o demonstrated perfect accuracy and rapid calculation capabilities, showing promise in reducing pediatric medication dosage errors. Further research is needed to explore their integration into practice.

Impact: Key Message Large Language Models (LLMs) like ChatGPT-4o and Claude-3.0 demonstrate perfect accuracy and significantly faster response times in pediatric medication dosage calculations, showing potential to reduce errors and save time. Addition to Existing Literature This study provides novel insights by quantitatively comparing LLM performance with experienced nurses, contributing to the understanding of AI's role in improving medication safety. Impact The findings emphasize the value of LLMs as supplemental tools in healthcare, particularly in high-stakes pediatric care, where they can reduce calculation errors and improve clinical efficiency.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Pediatric Research
Pediatric Research 医学-小儿科
CiteScore
6.80
自引率
5.60%
发文量
473
审稿时长
3-8 weeks
期刊介绍: Pediatric Research publishes original papers, invited reviews, and commentaries on the etiologies of children''s diseases and disorders of development, extending from molecular biology to epidemiology. Use of model organisms and in vitro techniques relevant to developmental biology and medicine are acceptable, as are translational human studies
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信