Comparing quantile regression forest and mixture density long short-term memory models for probabilistic post-processing of satellite precipitation-driven streamflow simulations

IF 5.8 | Zone 1, Earth Sciences | JCR Q1 GEOSCIENCES, MULTIDISCIPLINARY

Hydrology and Earth System Sciences | Pub Date: 2023-12-20 | DOI: 10.5194/hess-27-4529-2023

Yuhang Zhang, Aizhong Ye, B. Analui, P. Nguyen, S. Sorooshian, K. Hsu, Yuxuan Wang



Abstract

Deep learning (DL) and machine learning (ML) are widely used in hydrological modelling, where they play a critical role in improving the accuracy of hydrological predictions. However, the trade-off between model performance and computational cost has always been a challenge for hydrologists when selecting a suitable model, particularly for probabilistic post-processing with large numbers of ensemble members. This study systematically compares the quantile regression forest (QRF) model and the countable mixtures of asymmetric Laplacians long short-term memory (CMAL-LSTM) model as hydrological probabilistic post-processors. Specifically, we evaluate their ability to handle biased streamflow simulations driven by three satellite precipitation products across 522 nested sub-basins of the Yalong River basin in China. Model performance is assessed using a series of scoring metrics from both probabilistic and deterministic perspectives. Our results show that the QRF model and the CMAL-LSTM model are comparable in terms of probabilistic prediction, and that their performance is closely related to the flow accumulation area (FAA) of the sub-basin. The QRF model outperforms the CMAL-LSTM model in most sub-basins with smaller FAA, while the CMAL-LSTM model has a clear advantage in sub-basins with FAA larger than 60 000 km² in the Yalong River basin. In terms of deterministic predictions, the CMAL-LSTM model is preferred, especially when the raw streamflow used as input is poorly simulated. However, setting aside the differences in model performance, the QRF model with 100-member quantiles shows a noteworthy advantage: a 50 % reduction in computation time compared to the CMAL-LSTM model with the same number of ensemble members in all experiments. This study therefore provides insights into model selection in hydrological post-processing and into the trade-offs between model performance and computational efficiency. The findings highlight the importance of considering the specific application scenario, such as the catchment size and the required accuracy level, when selecting a suitable model for hydrological post-processing.
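To make the probabilistic side of the comparison concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard three-parameter asymmetric Laplace density (location `mu`, scale `s`, asymmetry `tau`), draws a 100-member ensemble from a small mixture of such components, and scores it with the ensemble form of the continuous ranked probability score (CRPS). The abstract does not name its scoring metrics, so CRPS is an assumption here, as are all function names and parameter values. In a CMAL-LSTM the mixture parameters would come from the network output; that part is omitted.

```python
import numpy as np


def ald_pdf(y, mu, s, tau):
    """Asymmetric Laplace density (assumed standard form); mu is the tau-quantile."""
    z = (np.asarray(y, dtype=float) - mu) / s
    # Check-function weighting: slope tau above mu, (tau - 1) below it.
    slope = np.where(z < 0, tau - 1.0, tau)
    return tau * (1.0 - tau) / s * np.exp(-z * slope)


def ald_ppf(u, mu, s, tau):
    """Inverse CDF of the asymmetric Laplace distribution, for u in (0, 1)."""
    u = np.asarray(u, dtype=float)
    lower = mu + s / (1.0 - tau) * np.log(u / tau)            # branch for u < tau
    upper = mu - s / tau * np.log((1.0 - u) / (1.0 - tau))    # branch for u >= tau
    return np.where(u < tau, lower, upper)


def sample_mixture(weights, mus, scales, taus, n_members, rng):
    """Draw ensemble members from a mixture of asymmetric Laplacians."""
    weights, mus, scales, taus = (
        np.asarray(a, dtype=float) for a in (weights, mus, scales, taus)
    )
    # Pick a component per member, then invert that component's CDF.
    comp = rng.choice(len(weights), size=n_members, p=weights)
    u = rng.uniform(size=n_members)
    return ald_ppf(u, mus[comp], scales[comp], taus[comp])


def crps_ensemble(members, obs):
    """Ensemble CRPS: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    members = np.asarray(members, dtype=float)
    spread = np.abs(members[:, None] - members[None, :]).mean()
    return np.abs(members - obs).mean() - 0.5 * spread


# Hypothetical mixture parameters for a single time step (illustrative only).
rng = np.random.default_rng(0)
ensemble = sample_mixture(
    weights=[0.6, 0.3, 0.1],
    mus=[120.0, 150.0, 90.0],
    scales=[10.0, 25.0, 5.0],
    taus=[0.5, 0.3, 0.7],
    n_members=100,
    rng=rng,
)
score = crps_ensemble(ensemble, obs=130.0)
```

The same `crps_ensemble` function could score the 100 quantiles produced by a quantile regression forest, so both post-processors can be compared on equal terms. As a sanity check, `crps_ensemble([0.0, 2.0], 1.0)` evaluates to 0.5, and an ensemble whose members all equal the observation scores 0.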


Source journal

Hydrology and Earth System Sciences (Geosciences, Multidisciplinary)

CiteScore: 10.10

Self-citation rate: 7.90%

Articles published: 273

Review time: 15 months

Journal description: Hydrology and Earth System Sciences (HESS) is a not-for-profit international two-stage open-access journal for the publication of original research in hydrology. HESS encourages and supports fundamental and applied research that advances the understanding of hydrological systems, their role in providing water for ecosystems and society, and the role of the water cycle in the functioning of the Earth system. A multi-disciplinary approach is encouraged that broadens the hydrological perspective and the advancement of hydrological science through integration with other cognate sciences and cross-fertilization across disciplinary boundaries.