Emergency Patient Triage Improvement through a Retrieval-Augmented Generation Enhanced Large-Scale Language Model
Authors: Megumi Yazaki, Satoshi Maki, Takeo Furuya, Ken Inoue, Ko Nagai, Yuki Nagashima, Juntaro Maruyama, Yasunori Toki, Kyota Kitagawa, Shuhei Iwata, Takaki Kitamura, Sho Gushiken, Yuji Noguchi, Masahiro Inoue, Yasuhiro Shiga, Kazuhide Inage, Sumihisa Orita, Takaaki Nakada, Seiji Ohtori
Journal: Prehospital Emergency Care, pp. 1-7 (impact factor 2.1; JCR Q2, Emergency Medicine)
Published: 2024-07-11
DOI: 10.1080/10903127.2024.2374400 (https://doi.org/10.1080/10903127.2024.2374400)
Citations: 0
Abstract
Objectives: Emergency medical triage is crucial for prioritizing patient care in emergency situations, yet its effectiveness can vary significantly based on the experience and training of the personnel involved. This study aims to evaluate the efficacy of integrating Retrieval Augmented Generation (RAG) with Large Language Models (LLMs), specifically OpenAI's GPT models, to standardize triage procedures and reduce variability in emergency care.
Methods: We created 100 simulated triage scenarios based on modified cases from the Japanese National Examination for Emergency Medical Technicians. These scenarios were processed by the RAG-enhanced LLMs, and the models were given patient vital signs, symptoms, and observations from emergency medical services (EMS) teams as inputs. The primary outcome was the accuracy of triage classifications, which was used to compare the performance of the RAG-enhanced LLMs with that of emergency medical technicians and emergency physicians. Secondary outcomes included the rates of under-triage and over-triage.
Results: The Generative Pre-trained Transformer 3.5 (GPT-3.5) model with RAG achieved a correct triage rate of 70%, significantly outperforming the emergency medical technicians (EMTs), who had correct rates of 35% and 38%, and the emergency physicians, who had correct rates of 50% and 47% (p < 0.05). This model also reduced the under-triage rate to 8%, compared with 33% for GPT-3.5 without RAG and 39% for GPT-4 without RAG.
Conclusions: The integration of RAG with LLMs shows promise in improving the accuracy and consistency of medical assessments in emergency settings. Further validation in diverse medical settings with broader datasets is necessary to confirm the effectiveness and adaptability of these technologies in live environments.
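The outcome measures named in the Methods (triage accuracy, under-triage rate, and over-triage rate) can be illustrated with a minimal Python sketch. This is not the authors' code. The three-level scale `green`/`yellow`/`red`, the function name `triage_rates`, and the choice to divide every rate by the total number of scenarios are assumptions made for illustration; the abstract does not specify the triage categories or the denominators used.

```python
from collections import Counter

# Hypothetical ordinal scale, least to most urgent.
# The abstract does not name the triage levels used in the study.
LEVELS = ["green", "yellow", "red"]
RANK = {level: i for i, level in enumerate(LEVELS)}


def triage_rates(assigned, reference):
    """Return accuracy, under-triage and over-triage as fractions of all cases.

    assigned:  triage levels given by a rater (model, EMT, or physician)
    reference: the correct triage level for each scenario
    """
    if len(assigned) != len(reference):
        raise ValueError("assigned and reference must have the same length")
    if not assigned:
        raise ValueError("at least one scenario is required")

    counts = Counter()
    for a, r in zip(assigned, reference):
        diff = RANK[a] - RANK[r]
        if diff == 0:
            counts["correct"] += 1
        elif diff < 0:
            counts["under"] += 1  # rated less urgent than the reference
        else:
            counts["over"] += 1   # rated more urgent than the reference

    n = len(assigned)
    return {
        "accuracy": counts["correct"] / n,
        "under_triage": counts["under"] / n,
        "over_triage": counts["over"] / n,
    }


if __name__ == "__main__":
    # Toy data, not taken from the study.
    reference = ["red", "yellow", "green", "red", "yellow"]
    assigned = ["red", "green", "green", "yellow", "red"]
    print(triage_rates(assigned, reference))
```

With these definitions the three rates always sum to 1, so a reduction in under-triage must show up as either higher accuracy or higher over-triage, which is why the two error rates are usually reported alongside accuracy.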
About the journal:
Prehospital Emergency Care publishes peer-reviewed information relevant to the practice, educational advancement, and investigation of prehospital emergency care, including the following types of articles: Special Contributions - Original Articles - Education and Practice - Preliminary Reports - Case Conferences - Position Papers - Collective Reviews - Editorials - Letters to the Editor - Media Reviews.