Multi-step ahead probabilistic runoff forecasting with SHAP interpretability: a GPR-enhanced deep learning ensemble approach integrating teleconnection factors
Feilin Zhu, Tiantian Hou, Ou Zhu, Yitong Sun, Weifeng Liu, Lingqi Zhao, Xuning Guo, Min Li, Ping-an Zhong
Environmental Modelling & Software, Volume 193, Article 106647 (Journal Article; impact factor 4.6; JCR Q1, Computer Science, Interdisciplinary Applications; subject category: Environmental Science and Ecology)
Published: 2025-08-07
DOI: 10.1016/j.envsoft.2025.106647
URL: https://www.sciencedirect.com/science/article/pii/S1364815225003317
Citations: 0
Abstract
Accurate medium- and long-term runoff forecasting is essential for reservoir scheduling, flood and drought mitigation, and water resource planning and management. To improve forecasting accuracy in river basins, this study introduces an integrated probabilistic forecasting framework built on an ensemble of deep learning models with interpretability analysis. First, a multi-round iterative selection method identifies pivotal predictors from 130 climate circulation indices, with SHAP (SHapley Additive exPlanations) analysis revealing ENSO indices as the dominant hydrological controls. Because single deep learning models are limited in capturing runoff nonlinearity, an enhanced Bidirectional Long Short-Term Memory (BiLSTM) architecture is developed from the LSTM foundation. Convolutional Neural Networks (CNNs) and Attention mechanisms are then progressively integrated, where the dominance of ENSO indices enables targeted extraction of high-impact climate signals, substantially improving prediction robustness. The screened teleconnection factors and runoff series serve as inputs to the CNN-BiLSTM-Attention ensemble model, which generates deterministic runoff forecasts for 1–12 months ahead. Gaussian Process Regression (GPR) quantifies prediction uncertainty to produce interval probabilistic forecasts, while SHAP identifies the key driving factors, demonstrating through interpretable feature attribution that ENSO contributions are central to reducing prediction errors. Evaluated with comprehensive deterministic and probabilistic metrics in China's Yalong River Basin, the ensemble model achieves superior accuracy with highly reliable probabilistic intervals. Critically, the interpretable linkage between ENSO dominance and model performance validates that climate-informed deep learning synthesizes physical insights with data-driven advantages. This synergy, spanning dynamic factor screening, hybrid architecture design, uncertainty quantification, and explainable AI, provides actionable insights for climate-resilient flood control, water allocation, and ecosystem management.
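The step in which GPR turns deterministic forecasts into interval forecasts can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the GPR takes the point forecast as its only input and the observed runoff as its target, uses a zero-mean Gaussian process with an RBF kernel and fixed hyperparameters, and runs on synthetic data. The function names (`rbf_kernel`, `gpr_interval`), the hyperparameter values, and the 90% interval level are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential covariance between two 1-D input arrays."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)


def gpr_interval(x_train, y_train, x_new, length_scale=1.0,
                 signal_var=1.0, noise_var=0.1, z=1.645):
    """Posterior mean and central interval from a zero-mean GP.

    z=1.645 corresponds to a nominal 90% interval. The hyperparameters are
    fixed here (an assumption); they are commonly tuned by maximising the
    marginal likelihood instead.
    """
    # Standardise inputs and targets so the zero-mean prior is reasonable.
    x_mu, x_sd = x_train.mean(), x_train.std()
    y_mu, y_sd = y_train.mean(), y_train.std()
    xs = (x_train - x_mu) / x_sd
    ys = (y_train - y_mu) / y_sd
    xn = (x_new - x_mu) / x_sd

    # Training covariance with observation noise on the diagonal.
    K = rbf_kernel(xs, xs, length_scale, signal_var)
    K += noise_var * np.eye(len(xs))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, ys))

    # Predictive mean and variance (noise included) at the new inputs.
    K_s = rbf_kernel(xs, xn, length_scale, signal_var)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = signal_var - np.sum(v**2, axis=0) + noise_var
    std = np.sqrt(np.maximum(var, 0.0))

    # Map back to the original runoff scale.
    mean = mean * y_sd + y_mu
    std = std * y_sd
    return mean, mean - z * std, mean + z * std


# Synthetic stand-ins (NOT data from the paper): a seasonal "observed" runoff
# series and a noisy "deterministic forecast" of it.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 8.0 * np.pi, 120)
observed = 800.0 + 400.0 * np.sin(t) + rng.normal(0.0, 60.0, 120)
point_forecast = observed + rng.normal(0.0, 80.0, 120)

# Fit on the first 96 months, produce intervals for the last 24.
mean, lower, upper = gpr_interval(point_forecast[:96], observed[:96],
                                  point_forecast[96:])
coverage = np.mean((observed[96:] >= lower) & (observed[96:] <= upper))
print(f"empirical coverage of the nominal 90% interval: {coverage:.2f}")
```

The printed coverage is the fraction of held-out observations that fall inside the interval, one simple example of a probabilistic reliability check. A fuller setup would fit one such model per lead time (1 to 12 months) and tune the kernel hyperparameters rather than fixing them.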
About the journal:
Environmental Modelling & Software publishes contributions, in the form of research articles, reviews and short communications, on recent advances in environmental modelling and/or software. The aim is to improve our capacity to represent, understand, predict or manage the behaviour of environmental systems at all practical scales, and to communicate those improvements to a wide scientific and professional audience.