Pre-trained large language models outperform statistics and machine learning forecasting visits in the emergency departments
Authors: Yi-Chang Yen, Chin-Chieh Wu, Shu-Hui Chen, Kuan-Fu Chen
Journal: American Journal of Emergency Medicine, volume 98, pages 298-308
Publication date: 2025-09-06
Publication type: Journal Article
DOI: 10.1016/j.ajem.2025.09.008
URL: https://www.sciencedirect.com/science/article/pii/S0735675725006199
Journal impact factor: 2.2 (JCR Q1, Emergency Medicine)
Citations: 0
Abstract
Objectives
The unpredictability of emergency department (ED) visits is a significant contributor to ED crowding. This study compares conventional statistical, machine learning, and large language model (LLM) methods for forecasting daily ED visits at primary, secondary, and tertiary hospitals across different regions of Taiwan during the pre-COVID-19, COVID-19, and post-pandemic periods.
Methods
Daily ED visit records from 2007 to 2022, derived from the electronic medical records of six hospitals across Taiwan, were combined with calendar data and COVID-19 pandemic indicators to serve as input features. The primary objective was to develop models for a seven-day forecast horizon. Statistical models (SARIMAX and Prophet) and machine learning models (Light Gradient Boosting Machine (LightGBM), Long Short-Term Memory (LSTM), DLinear, and Time-series Dense Encoder (TiDE)) were developed using training data from 2007 to 2017. We then compared the performance of the statistical models, the machine learning models, and a pre-trained transformer-based LLM on the testing set (2018–2022), which covered the pre-COVID-19, COVID-19, and post-pandemic periods. The evaluation metric was the Mean Absolute Percentage Error (MAPE): the mean absolute difference between predicted and actual values, expressed as a percentage of the actual values.
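The MAPE metric named above can be illustrated with a minimal Python sketch. The function name `mape` and the visit counts are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Return the Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("inputs must be non-empty")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average the absolute errors, each taken relative to the actual value.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical seven-day window of daily ED visit counts (illustrative only).
actual_visits = [200, 180, 220, 210, 190, 250, 240]
forecast_visits = [190, 189, 209, 210, 209, 225, 252]

print(f"MAPE: {mape(actual_visits, forecast_visits):.2f}%")
```

Because each error is scaled by the actual count, MAPE allows hospitals with very different daily volumes to be compared on one scale, but it is undefined on days with zero visits, which is why the sketch rejects such inputs.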
Results
A total of 7,540,271 ED visits and 31,064 data points were recorded across two tertiary hospitals, three regional hospitals, and one primary hospital. The conventional statistical models revealed a significant seven-day cycle in ED visit data across the hospitals. Daily ED visits surged significantly during the COVID-19 pandemic. The pre-trained LLM demonstrated the best overall performance (MAPE 7.59, 95% CI: 7.20–7.99), closely followed by LightGBM (MAPE 8.08, 95% CI: 7.67–8.50). In the pre-COVID-19 period, TiDE (MAPE 5.89, 95% CI: 5.48–6.29) and Prophet (MAPE 6.80, 95% CI: 6.41–7.18) performed best. Although the abrupt changes during COVID-19 degraded the performance of all models, the pre-trained LLM and LightGBM demonstrated resilience, with MAPEs of 9.03 (95% CI: 8.32–9.77) and 10.60 (95% CI: 9.77–11.52), respectively.
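The abstract does not state how the 95% confidence intervals around MAPE were obtained. One common approach is a percentile bootstrap over the daily absolute percentage errors, sketched below as an assumption rather than as the authors' method. The function name `bootstrap_mape_ci` and the error values are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def bootstrap_mape_ci(
    abs_pct_errors: Sequence[float],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap interval for the mean of daily absolute percentage errors.

    Assumption: days are resampled independently, which ignores the
    autocorrelation that daily forecast errors usually exhibit.
    """
    if len(abs_pct_errors) == 0:
        raise ValueError("abs_pct_errors must be non-empty")
    rng = random.Random(seed)
    n = len(abs_pct_errors)
    means: List[float] = []
    for _ in range(n_resamples):
        # Draw n errors with replacement and record their mean (one bootstrap MAPE).
        sample = rng.choices(abs_pct_errors, k=n)
        means.append(sum(sample) / n)
    means.sort()
    lower_index = int((alpha / 2) * n_resamples)
    upper_index = max(lower_index, int((1 - alpha / 2) * n_resamples) - 1)
    return means[lower_index], means[upper_index]


# Hypothetical daily absolute percentage errors, in percent (illustrative only).
daily_errors = [5.0, 5.0, 5.0, 0.0, 10.0, 10.0, 5.0, 7.5, 2.5, 12.0]
lower, upper = bootstrap_mape_ci(daily_errors)
print(f"MAPE 95% CI: {lower:.2f} to {upper:.2f}")
```

Because errors on consecutive days are typically correlated, a block bootstrap that resamples contiguous runs of days would likely give more realistic interval widths than this independent resampling.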
Conclusions
The pre-trained LLM showed superior overall performance in forecasting ED visits, particularly during the pandemic and post-pandemic periods. LightGBM performed relatively well across all periods. Prophet and TiDE demonstrated favorable and stable performance only during the pre-pandemic period. These findings underscore the potential of advanced time series models to improve ED visit forecasts.
About the journal:
A distinctive blend of practicality and scholarliness makes the American Journal of Emergency Medicine a key source for information on emergency medical care. Covering all activities concerned with emergency medicine, it is the journal to turn to for information to help increase the ability to understand, recognize and treat emergency conditions. Issues contain clinical articles, case reports, review articles, editorials, international notes, book reviews and more.