Effectiveness of Integrating Ensemble-Based Feature Selection and Novel Gradient Boosted Trees in Runoff Prediction: A Case Study in Vu Gia Thu Bon River Basin, Vietnam

IF 1.9, Zone 4 (Earth Sciences), JCR Q2 (Geochemistry & Geophysics)
Oluwatobi Aiyelokun, Quoc Bao Pham, Oluwafunbi Aiyelokun, Nguyen Thi Thuy Linh, Tirthankar Roy, Duong Tran Anh, Ewa Łupikasza
Journal: pure and applied geophysics. Publication type: Journal Article. Published: 2024-04-28. DOI: 10.1007/s00024-024-03486-0
Article page: https://link.springer.com/article/10.1007/s00024-024-03486-0
Citation count: 0

Abstract

Traditional rainfall-runoff modeling techniques require large datasets and often an exhaustive calibration process, which is challenging in poorly gauged basins and resource-limited settings. It is therefore necessary to examine new ways of constructing predictive runoff models that achieve satisfactory results while minimizing data requirements and model construction time. This study examined, for two adjacent catchments in Vietnam, the effectiveness of integrating Random Forest (RF), used to identify important features, with novel gradient boosted trees. Antecedent daily runoff, in combination with daily and one-day antecedent rainfall, was found to significantly influence the runoff at the outlet of the catchments. Categorical Boosting (CatBoost) and Extreme Gradient Boosting (XGBoost) were effective in predicting day-ahead runoff: CatBoost achieved NSE, d, r, and R² values of 0.92, 0.98, 0.96, and 0.92, respectively, and XGBoost achieved values of 0.91, 0.98, 0.96, and 0.92, respectively. A comparison with previous studies indicated that the models were very effective, since they better reduced generalization errors across the calibration and validation phases. The study presents the integration of RF and gradient boosted trees as a simplified alternative to computationally expensive and data-intensive physically based rainfall-runoff models. Practitioners can build on the experiments presented here to reduce the computation time, model construction complexity, and data requirements that are often serious constraints in physically based rainfall-runoff modeling.
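
The abstract reports skill as NSE, d, r, and R² and names antecedent runoff, same-day rainfall, and one-day antecedent rainfall as the influential inputs. The sketch below shows one common way to compute these scores and to lay out such inputs. It is not the authors' code. It assumes that d is Willmott's index of agreement, that R² is the square of Pearson's r (consistent with the reported 0.96 and 0.92), and that the target is runoff on day t given inputs up to day t; the function names and the exact lag layout are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = _mean(obs)
    sq_err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sq_err / spread


def index_of_agreement(obs, sim):
    """Willmott's index of agreement d (assumed to be the 'd' in the abstract)."""
    mean_obs = _mean(obs)
    sq_err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    potential = sum(
        (abs(s - mean_obs) + abs(o - mean_obs)) ** 2 for o, s in zip(obs, sim)
    )
    return 1.0 - sq_err / potential


def pearson_r(obs, sim):
    """Pearson correlation coefficient; R^2 is taken here as its square."""
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)


def lagged_features(runoff, rainfall):
    """Pair each target Q[t] with (Q[t-1], P[t], P[t-1]); this lag layout is an assumption."""
    rows, targets = [], []
    for t in range(1, len(runoff)):
        rows.append((runoff[t - 1], rainfall[t], rainfall[t - 1]))
        targets.append(runoff[t])
    return rows, targets


if __name__ == "__main__":
    # Toy numbers for illustration only, not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [2.0, 3.0, 4.0, 5.0]
    r = pearson_r(observed, simulated)
    print(nse(observed, simulated), index_of_agreement(observed, simulated), r, r ** 2)
```

The toy series is biased by a constant offset, so r and R² stay at their maximum while NSE and d drop. This is one reason to report several scores together, as the abstract does. The rows returned by `lagged_features` could then be passed to a feature-ranking model and a boosted-tree regressor.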

Source journal: pure and applied geophysics (Earth Sciences: Geochemistry & Geophysics)
CiteScore: 4.20
Self-citation rate: 5.00%
Articles published: 240
Review time: 9.8 months
About the journal: pure and applied geophysics (pageoph), a continuation of the journal "Geofisica pura e applicata", publishes original scientific contributions in the fields of solid Earth, atmospheric and oceanic sciences. Regular and special issues feature thought-provoking reports on active areas of current research and state-of-the-art surveys. It is a long-running journal, founded in 1939 as Geofisica pura e applicata. It publishes peer-reviewed original scientific contributions and state-of-the-art surveys in solid earth and atmospheric sciences, features thought-provoking reports on active areas of current research, and is a major source for publications on tsunami research. Coverage extends to research topics in oceanic sciences.