Forecasting | Pub Date: 2024-04-02 | DOI: 10.3390/forecast6020015
Fernando Ferreira Lima dos Santos, Farzaneh Khorsandi
{"title":"Riding into Danger: Predictive Modeling for ATV-Related Injuries and Seasonal Patterns","authors":"Fernando Ferreira Lima dos Santos, Farzaneh Khorsandi","doi":"10.3390/forecast6020015","DOIUrl":"https://doi.org/10.3390/forecast6020015","url":null,"abstract":"All-Terrain Vehicles (ATVs) are popular off-road vehicles in the United States, with a staggering 10.5 million households reported to own at least one ATV. Despite their popularity, ATVs pose a significant risk of severe injuries, leading to substantial healthcare expenses and raising public health concerns. As such, gaining insights into the patterns of ATV-related hospitalizations and accurately predicting these injuries is of paramount importance. This knowledge can guide the development of effective prevention strategies, ultimately mitigating ATV-related injuries and the associated healthcare costs. Therefore, we performed an in-depth analysis of ATV-related hospitalizations from 2010 to 2021. Furthermore, we developed and assessed the performance of three forecasting models—Neural Prophet, SARIMA, and LSTM—to predict ATV-related injuries. The performance of these models was evaluated using the Root Mean Square Error (RMSE) accuracy metric. As a result, the LSTM model outperformed the others and could be used to provide valuable insights that can aid in strategic planning and resource allocation within healthcare systems. 
In addition, our findings highlight the urgent need for prevention programs that are specifically targeted toward youth and timed for the summer season.","PeriodicalId":508737,"journal":{"name":"Forecasting","volume":"92 9","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140752634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
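The abstract above compares its forecasting models by Root Mean Square Error (RMSE). As a minimal sketch of that metric only (not the authors' code), the function below computes RMSE for two equal-length sequences. The function name `rmse` and the sample counts are hypothetical and chosen for illustration.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of squared errors, then square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical monthly injury counts and forecasts (illustrative only).
actual_counts = [10.0, 12.0, 14.0]
forecast_counts = [11.0, 12.0, 12.0]
score = rmse(actual_counts, forecast_counts)
print(f"RMSE = {score:.3f}")
```

Because the errors are squared before averaging, RMSE penalizes a few large misses more heavily than many small ones, which is worth keeping in mind when comparing models on series with seasonal peaks.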
Forecasting | Pub Date: 2024-03-25 | DOI: 10.3390/forecast6020014
A. Lekidis, Angelos Georgakis, Christos Dalamagkas, Elpiniki I. Papageorgiou
{"title":"Predictive Maintenance Framework for Fault Detection in Remote Terminal Units","authors":"A. Lekidis, Angelos Georgakis, Christos Dalamagkas, Elpiniki I. Papageorgiou","doi":"10.3390/forecast6020014","DOIUrl":"https://doi.org/10.3390/forecast6020014","url":null,"abstract":"The scheduled maintenance of industrial equipment is typically performed at a low frequency, as it usually leads to unpredicted downtime in business operations. Nevertheless, this confers a risk of failure in individual modules of the equipment, which may diminish its performance or even lead to its breakdown, rendering it non-operational. Lately, predictive maintenance methods have been considered for industrial systems, such as power generation stations, as a proactive measure for preventing failures. Such methods use data gathered from industrial equipment and Machine Learning (ML) algorithms to identify data patterns that indicate anomalies and may lead to potential failures. However, industrial equipment exhibits specific behavior and interactions that originate from its configuration by the manufacturer and the system in which it is installed, which constitutes a great challenge for the effectiveness of ML model maintenance and failure predictions. In this article, we propose a novel method for tackling this challenge based on the development of a digital twin for industrial equipment known as a Remote Terminal Unit (RTU). RTUs are used in electrical systems to provide the remote monitoring and control of critical equipment, such as power generators. 
The method is applied in an RTU that is connected to a real power generator within a Public Power Corporation (PPC) facility, where operational anomalies are forecasted based on measurements of its processing power, operating temperature, voltage, and storage memory.","PeriodicalId":508737,"journal":{"name":"Forecasting","volume":" September","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140383494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effective Natural Language Processing Algorithms for Early Alerts of Gout Flares from Chief Complaints","authors":"Lucas Lopes Oliveira, Xiaorui Jiang, Aryalakshmi Nellippillipathil Babu, Poonam Karajagi, Alireza Daneshkhah","doi":"10.3390/forecast6010013","DOIUrl":"https://doi.org/10.3390/forecast6010013","url":null,"abstract":"Early identification of acute gout is crucial, enabling healthcare professionals to implement targeted interventions for rapid pain relief and preventing disease progression, ensuring improved long-term joint function. In this study, we comprehensively explored the potential early detection of gout flares (GFs) based on nurses’ chief complaint notes in the Emergency Department (ED). Addressing the challenge of identifying GFs prospectively during an ED visit, where documentation is typically minimal, our research focused on employing alternative Natural Language Processing (NLP) techniques to enhance detection accuracy. We investigated GF detection algorithms using both sparse representations by traditional NLP methods and dense encodings by medical domain-specific Large Language Models (LLMs), distinguishing between generative and discriminative models. Three methods were used to alleviate the issue of severe data imbalances, including oversampling, class weights, and focal loss. Extensive empirical studies were performed on the Gout Emergency Department Chief Complaint Corpora. Sparse text representations like tf-idf proved to produce strong performances, achieving F1 scores higher than 0.75. The best deep learning models were RoBERTa-large-PM-M3-Voc and BioGPT, which had the best F1 scores for each dataset, with a 0.8 on the 2019 dataset and a 0.85 F1 score on the 2020 dataset, respectively. 
We concluded that although discriminative LLMs performed better for this classification task when compared to generative LLMs, a combination of using generative models as feature extractors and employing a support vector machine for classification yielded promising results comparable to those obtained with discriminative models.","PeriodicalId":508737,"journal":{"name":"Forecasting","volume":"52 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140254685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
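The abstract above lists focal loss as one of three remedies for severe class imbalance. The sketch below shows the standard binary focal loss for a single example, assuming the usual formulation FL = -alpha_t * (1 - p_t)^gamma * log(p_t). The function name and the default values `gamma=2.0` and `alpha=0.25` are assumptions for illustration, not settings reported by the paper.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for one example.

    p: predicted probability of the positive class (0..1)
    y: true label, 1 (positive, e.g. gout flare) or 0 (negative)
    gamma: focusing parameter; 0 reduces to weighted cross-entropy
    alpha: weight given to the positive class
    """
    # p_t is the probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # guard against log(0)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.9, 1)  # confident, correct prediction
hard = binary_focal_loss(0.6, 1)  # uncertain prediction
print(easy < hard)
```

The modulating factor means that the many easy negatives in an imbalanced corpus contribute little to the total loss, so training focuses on the rare, harder positive cases.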
Forecasting | Pub Date: 2024-02-16 | DOI: 10.3390/forecast6010010
K. P. Fourkiotis, Athanasios Tsadiras
{"title":"Applying Machine Learning and Statistical Forecasting Methods for Enhancing Pharmaceutical Sales Predictions","authors":"K. P. Fourkiotis, Athanasios Tsadiras","doi":"10.3390/forecast6010010","DOIUrl":"https://doi.org/10.3390/forecast6010010","url":null,"abstract":"In today’s evolving global world, the pharmaceutical sector faces an emerging challenge, which is the rapid surge of the global population and the consequent growth in drug production demands. Recognizing this, our study explores the urgent need to strengthen pharmaceutical production capacities, ensuring drugs are allocated and stored strategically to meet diverse regional and demographic needs. Our research focuses on the promising area of drug demand forecasting using artificial intelligence (AI) and machine learning (ML) techniques to enhance predictions in the pharmaceutical field. Supplied with a rich dataset from Kaggle spanning 600,000 sales records from a single pharmacy, our study embarks on a thorough exploration of univariate time series analysis. Here, we pair conventional analytical tools such as ARIMA with advanced methodologies like LSTM neural networks, all with a single aim: refining the precision of our sales forecasts. Venturing deeper, our data underwent categorisation and were segmented into eight clusters based on the Anatomical Therapeutic Chemical (ATC) Classification System. This segmentation reveals the evident influence of seasonality on drug sales. The analysis not only highlights the effectiveness of machine learning models but also illuminates the remarkable success of XGBoost. 
This algorithm outperformed traditional models, achieving the lowest MAPE values: 17.89% for M01AB (anti-inflammatory and antirheumatic products, non-steroids, acetic acid derivatives, and related substances), 16.92% for M01AE (anti-inflammatory and antirheumatic products, non-steroids, and propionic acid derivatives), 17.98% for N02BA (analgesics, antipyretics, and anilides), and 16.05% for N02BE (analgesics, antipyretics, pyrazolones, and anilides). XGBoost further demonstrated exceptional precision with the lowest MSE scores: 28.8 for M01AB, 1518.56 for N02BE, and 350.84 for N05C (hypnotics and sedatives). Additionally, the Seasonal Naïve model recorded an MSE of 49.19 for M01AE, while the Single Exponential Smoothing model showed an MSE of 7.19 for N05B. These findings underscore the strengths derived from employing a diverse range of approaches within the forecasting series. In summary, our research accentuates the significance of leveraging machine learning techniques to derive valuable insights for pharmaceutical companies. By applying the power of these methods, companies can optimize their production, storage, distribution, and marketing practices.","PeriodicalId":508737,"journal":{"name":"Forecasting","volume":"687 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140453869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
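The abstract above reports MAPE and MSE values and mentions a Seasonal Naïve baseline. As a hedged sketch of what those terms mean (not the authors' implementation), the code below defines MAPE, MSE, and a seasonal naive forecaster that repeats the last observed season. All function names and the toy sales series are hypothetical.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def mse(actual, predicted):
    """Mean Squared Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def seasonal_naive(history, season_length, horizon):
    """Forecast by repeating the most recent full season of observations."""
    last_season = history[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]


# Toy series with a season of length 3 (illustrative values only).
sales = [1, 2, 3, 4, 5, 6]
forecast = seasonal_naive(sales, season_length=3, horizon=4)
print(forecast)
print(mape([100.0, 200.0], [110.0, 180.0]), mse([100.0, 200.0], [110.0, 180.0]))
```

MAPE is scale-free, so it can be compared across drug categories with very different sales volumes, whereas MSE depends on the scale of each series. That may explain why the MSE figures quoted in the abstract differ by orders of magnitude between categories.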
Forecasting | Pub Date: 2024-02-12 | DOI: 10.3390/forecast6010009
Yoga Sasmita, Heri Kuswanto, D. Prastyo
{"title":"State-Dependent Model Based on Singular Spectrum Analysis Vector for Modeling Structural Breaks: Forecasting Indonesian Export","authors":"Yoga Sasmita, Heri Kuswanto, D. Prastyo","doi":"10.3390/forecast6010009","DOIUrl":"https://doi.org/10.3390/forecast6010009","url":null,"abstract":"Standard time-series modeling requires the stability of model parameters over time. The instability of model parameters is often caused by structural breaks, leading to the formation of nonlinear models. A state-dependent model (SDM) is a more general and flexible scheme in nonlinear modeling. On the other hand, time-series data often exhibit multiple frequency components, such as trends, seasonality, cycles, and noise. These frequency components can be optimized in forecasting using Singular Spectrum Analysis (SSA). Furthermore, the two most widely used approaches in SSA are Linear Recurrent Formula (SSAR) and Vector (SSAV). SSAV has better accuracy and robustness than SSAR, especially in handling structural breaks. Therefore, this research proposes modeling the SSAV coefficient with an SDM approach to account for structural breaks, an approach called SDM-SSAV. SDM recursively updates the SSAV coefficient to adapt over time and between states using an Extended Kalman Filter (EKF). 
Empirical results with Indonesian Export data and simulation studies show that the accuracy of SDM-SSAV outperforms SSAR, SSAV, SDM-SSAR, hybrid ARIMA-LSTM, and VARI.","PeriodicalId":508737,"journal":{"name":"Forecasting","volume":"131 12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139843007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Forecasting | Pub Date: 2024-02-05 | DOI: 10.3390/forecast6010008
Han Lin Shang
{"title":"Bootstrapping Long-Run Covariance of Stationary Functional Time Series","authors":"Han Lin Shang","doi":"10.3390/forecast6010008","DOIUrl":"https://doi.org/10.3390/forecast6010008","url":null,"abstract":"A key summary statistic in a stationary functional time series is the long-run covariance function that measures serial dependence. It can be consistently estimated via a kernel sandwich estimator, which is the core of dynamic functional principal component regression for forecasting functional time series. To measure the uncertainty of the long-run covariance estimation, we consider sieve and functional autoregressive (FAR) bootstrap methods to generate pseudo-functional time series and study variability associated with the long-run covariance. The sieve bootstrap method is nonparametric (i.e., model-free), while the FAR bootstrap method is semi-parametric. The sieve bootstrap method relies on functional principal component analysis to decompose a functional time series into a set of estimated functional principal components and their associated scores. The scores can be bootstrapped via a vector autoregressive representation. The bootstrapped functional time series are obtained by multiplying the bootstrapped scores by the estimated functional principal components. The FAR bootstrap method relies on the FAR of order 1 to model the conditional mean of a functional time series, while residual functions can be bootstrapped via independent and identically distributed resampling. 
Through a series of Monte Carlo simulations, we evaluate and compare the finite-sample accuracy between the sieve and FAR bootstrap methods for quantifying the estimation uncertainty of the long-run covariance of a stationary functional time series.","PeriodicalId":508737,"journal":{"name":"Forecasting","volume":"45 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139865747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Forecasting | Pub Date: 2024-02-01 | DOI: 10.3390/forecast6010007
Manuel Zamudio López, H. Zareipour, Mike Quashie
{"title":"Forecasting the Occurrence of Electricity Price Spikes: A Statistical-Economic Investigation Study","authors":"Manuel Zamudio López, H. Zareipour, Mike Quashie","doi":"10.3390/forecast6010007","DOIUrl":"https://doi.org/10.3390/forecast6010007","url":null,"abstract":"This research proposes an investigative experiment employing binary classification for short-term electricity price spike forecasting. Numerical definitions for price spikes are derived from economic and statistical thresholds. The predictive task employs two tree-based machine learning classifiers and a deterministic point forecaster (a statistical regression model). Hyperparameters for the tree-based classifiers are optimized for statistical performance based on recall, precision, and F1-score. The deterministic forecaster is adapted from the literature on electricity price forecasting for the classification task. Additionally, one tree-based model prioritizes interpretability, generating decision rules that are subsequently utilized to produce price spike forecasts. For all models, we evaluate the final statistical and economic predictive performance. The interpretable model is analyzed for the trade-off between performance and interpretability. Numerical results highlight the significance of complementing statistical performance with economic assessment in electricity price spike forecasting. 
All experiments utilize data from Alberta’s electricity market.","PeriodicalId":508737,"journal":{"name":"Forecasting","volume":"12 11","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139686651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Forecasting | Pub Date: 2024-01-31 | DOI: 10.3390/forecast6010006
Sabrina De Nardi, C. Carnevale, Sara Raccagni, L. Sangiorgi
{"title":"Data-Driven Models to Forecast the Impact of Temperature Anomalies on Rice Production in Southeast Asia","authors":"Sabrina De Nardi, C. Carnevale, Sara Raccagni, L. Sangiorgi","doi":"10.3390/forecast6010006","DOIUrl":"https://doi.org/10.3390/forecast6010006","url":null,"abstract":"Models are a core element in performing local estimation of the climate change input. In this work, a novel approach to perform a fast downscaling of global temperature anomalies on a regional level is presented. The approach is based on a set of data-driven models linking global temperature anomalies and regional and global emissions to regional temperature anomalies. In particular, due to the limited number of available data, a linear autoregressive structure with exogenous input (ARX) has been considered. To demonstrate their relevance to the existing literature and context, the proposed ARX models have been employed to evaluate the impact of temperature anomalies on rice production in a socially, economically, and climatologically fragile area like Southeast Asia. The results show a significant impact on this region, with estimations strongly in accordance with information presented in the literature from different sources and scientific fields. The work represents a first step towards the development of a fast, data-driven, holistic approach to the climate change impact evaluation problem. 
The proposed ARX data-driven models reveal a novel and feasible way to downscale global temperature anomalies to regional levels, showing their importance in comprehending global temperature anomalies, emissions, and regional climatic conditions.","PeriodicalId":508737,"journal":{"name":"Forecasting","volume":"497 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140471262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}