Comparing SVR and Random Forest Forecasting based on Autoregressive Time Series with Application

N. Fadhil, Zinah ALbazzaz
IRAQI JOURNAL OF STATISTICAL SCIENCES, volume 118. Journal article, published 2023-12-01. DOI: 10.33899/iqjoss.2023.181220 (https://doi.org/10.33899/iqjoss.2023.181220)

Abstract

Accurate forecasting of the maximum and minimum relative humidity time series is important for controlling environmental impacts, damage, and risks. This study uses the support vector regression (SVR) and random forest (RF) methods, built on the autoregressive (AR) principle and on autocorrelation (AC), the main characteristic of time series in general. Lags of the original time series serve as the explanatory (input) variables, and the original series serves as the target variable. This structure is consistent with the AC principle, because the observation at each time step depends on its preceding lags. The forecasts of the SVR and RF methods are compared with each other and with the classical method of time series analysis, the autoregressive integrated moving average (ARIMA) model. SVR and RF were chosen for their ability to improve forecast results: they are well suited to the non-linearity of the data and to the heterogeneity of climate data, which contain many seasonal and periodic components that may lead to inaccurate forecasts. The minimum and maximum relative humidity series were studied for one of the agricultural meteorological stations in the city of Mosul, Iraq. The results showed that both the SVR and RF methods outperformed the classical ARIMA model. RF forecast the training period better than SVR. SVR was nevertheless more balanced: it outperformed ARIMA in both the training and testing periods, and its forecasts were slightly better than those of RF in the testing period.
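The lag structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the number of lags, the train/test proportion, or the error measure, so `n_lags=3`, `train_fraction=0.8`, the RMSE metric, the helper names, and the sample values are all assumptions made for illustration.

```python
from math import sqrt


def make_lagged(series, n_lags):
    """Return (X, y): each row of X holds the n_lags previous values
    (oldest first) and y holds the observation that follows them."""
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X = [list(series[t - n_lags:t]) for t in range(n_lags, len(series))]
    y = [series[t] for t in range(n_lags, len(series))]
    return X, y


def chronological_split(X, y, train_fraction=0.8):
    """Split without shuffling, so the test period follows the training period."""
    cut = int(len(X) * train_fraction)
    return X[:cut], y[:cut], X[cut:], y[cut:]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up humidity-like values, for illustration only (not the study's data).
humidity = [62.0, 65.0, 70.0, 68.0, 64.0, 60.0, 63.0, 67.0, 71.0, 69.0]

# Assumed lag order of 3; the abstract does not report the order used.
X, y = make_lagged(humidity, n_lags=3)
X_train, y_train, X_test, y_test = chronological_split(X, y)

# Naive reference forecast: predict each value by its most recent lag.
baseline = [row[-1] for row in X_test]

print("first input row and target:", X[0], y[0])
print("train/test sizes:", len(X_train), len(X_test))
print("naive baseline RMSE on the test period:", rmse(y_test, baseline))
```

The resulting `X_train` and `y_train` have the shape that a general-purpose regressor expects, so they could be passed to an SVR or random forest implementation (for example scikit-learn's `SVR` or `RandomForestRegressor`), and the test-period error compared against an ARIMA forecast of the same period. The split is kept chronological because shuffling would let later observations leak into the training period.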