Evaluation of Data-driven Hybrid Machine Learning Algorithms for Modelling Daily Reference Evapotranspiration

IF 1.8 · Zone 4, Earth Sciences · JCR Q4, Meteorology & Atmospheric Sciences

Atmosphere-Ocean Pub Date : 2022-06-20 DOI:10.1080/07055900.2022.2087589

N. L. Kushwaha, Jitendra Rajput, D. Sena, A. Elbeltagi, Dhananjai Singh, I. Mani

Volume: 60 1 · Pages: 519-540 · Publication type: Journal Article

Citations: 16

Abstract

Reference evapotranspiration (ET0) is a crucial variable in irrigation scheduling, agricultural production, and water balance studies. This study compares six models built by sequentially including six meteorological input variables: minimum temperature (Tmin), maximum temperature (Tmax), mean relative humidity (RH), wind speed (SW), sunshine hours (HSS), and solar radiation (RS), all of which physical and empirical models require to estimate ET0. Each model was run with three machine learning algorithms independently, namely Additive Regression (AdR), Random Subspace (RSS), and the M5 Pruning tree (M5P), and with four novel permuted hybrid combinations of these algorithms. The independent and hybrid models were evaluated comprehensively to assess the efficacy of the hybridizations and the stability of the machine learning models. Prediction accuracy improved as more input variables were included. The AdR6 model, which included all six selected meteorological variables, outperformed the other models during the testing period, with MAPE = 1.30, RMSE = 0.07, RAE = 2.41, RRSE = 3.10, and R² = 0.998. However, the AdR algorithm alone captured about 86% of the variance in the observed data, conforming to the 95% confidence band, across all models irrespective of the number of input variables used to predict ET0. The RSS algorithm, in contrast to the other algorithms, failed to capture the observed trends even with all the input variables. Hybrid combinations that included AdR as a constituent predicted more accurately than the other hybrids but remained inferior to AdR on its own. All the algorithms were better predictors of the higher ET0 values, including values beyond the 75th percentile (third quartile).
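The five statistics quoted in the abstract can be computed from paired observed and predicted ET0 values. The sketch below is only illustrative and is not the authors' code. The function name `evaluation_metrics` and the sample values are hypothetical. It assumes RAE and RRSE are expressed as percentages relative to a mean-only predictor, and that R² is the squared Pearson correlation; the abstract does not state which definitions were used.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return MAPE, RMSE, RAE, RRSE and R^2 for paired series.

    Assumptions (not confirmed by the abstract): MAPE, RAE and RRSE are
    percentages, and R^2 is the squared Pearson correlation coefficient.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("series must be non-empty and of equal length")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    abs_err = [abs(o - p) for o, p in zip(observed, predicted)]
    sq_err = [(o - p) ** 2 for o, p in zip(observed, predicted)]

    # Total deviation of the observations from their own mean
    # (the error of a naive "always predict the mean" model).
    abs_dev = sum(abs(o - mean_obs) for o in observed)
    sq_dev_obs = sum((o - mean_obs) ** 2 for o in observed)
    sq_dev_pred = sum((p - mean_pred) ** 2 for p in predicted)

    # Mean absolute percentage error; undefined if any observation is 0.
    mape = 100.0 * sum(e / abs(o) for e, o in zip(abs_err, observed)) / n
    rmse = math.sqrt(sum(sq_err) / n)
    rae = 100.0 * sum(abs_err) / abs_dev
    rrse = 100.0 * math.sqrt(sum(sq_err) / sq_dev_obs)

    # Squared Pearson correlation between observed and predicted values.
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    r2 = cov ** 2 / (sq_dev_obs * sq_dev_pred)

    return {"MAPE": mape, "RMSE": rmse, "RAE": rae, "RRSE": rrse, "R2": r2}


if __name__ == "__main__":
    # Made-up daily ET0 values (mm/day), for illustration only.
    obs = [2.0, 4.0, 6.0, 8.0]
    pred = [2.2, 3.8, 6.2, 7.8]
    for name, value in evaluation_metrics(obs, pred).items():
        print(f"{name}: {value:.3f}")
```

Under these definitions, RAE and RRSE values below 100 indicate that a model beats a predictor that always returns the observed mean, so the small values reported for AdR6 would mean errors of only a few percent of that baseline.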


Source journal

Atmosphere-Ocean (Earth Sciences: Oceanography)

CiteScore: 2.50

Self-citation rate: 16.70%

Number of articles published: not stated

Review time: >12 weeks

About the journal: Atmosphere-Ocean is the principal scientific journal of the Canadian Meteorological and Oceanographic Society (CMOS). It contains results of original research, survey articles, notes and comments on published papers in all fields of the atmospheric, oceanographic and hydrological sciences. Arctic, coastal and mid- to high-latitude regions are areas of particular interest. Applied or fundamental research contributions in English or French on the following topics are welcomed: climate and climatology; observation technology, remote sensing; forecasting, modelling, numerical methods; physics, dynamics, chemistry, biogeochemistry; boundary layers, pollution, aerosols; circulation, cloud physics, hydrology, air-sea interactions; waves, ice, energy exchange and related environmental topics.