Multi-step ahead probabilistic runoff forecasting with SHAP interpretability: a GPR-enhanced deep learning ensemble approach integrating teleconnection factors

IF 4.6 | Zone 2: Environmental Science and Ecology | JCR Q1: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Feilin Zhu, Tiantian Hou, Ou Zhu, Yitong Sun, Weifeng Liu, Lingqi Zhao, Xuning Guo, Min Li, Ping-an Zhong
Journal: Environmental Modelling & Software, Volume 193, Article 106647
Publication type: Journal Article
Published: 2025-08-07
DOI: 10.1016/j.envsoft.2025.106647
URL: https://www.sciencedirect.com/science/article/pii/S1364815225003317
Citations: 0

Abstract

Accurate medium- and long-term runoff forecasting is essential for scientific reservoir scheduling, flood and drought mitigation, and water resource planning and management. To improve forecasting accuracy in river basins, this study introduces an integrated probabilistic forecasting framework based on an ensemble of multiple deep learning models with interpretability analysis. First, a multi-round iterative selection method identifies pivotal predictors from 130 climate circulation indices, with SHAP (SHapley Additive exPlanations) analysis revealing ENSO indices as the dominant hydrological controls. Recognizing the limitations of single deep learning models in capturing runoff nonlinearity, an enhanced Bidirectional Long Short-Term Memory (BiLSTM) architecture is developed from the LSTM foundation. Convolutional Neural Networks (CNNs) and Attention mechanisms are then progressively integrated, where the dominance of ENSO indices enables targeted extraction of high-impact climate signals, substantially improving prediction robustness. The screened teleconnection factors and runoff series serve as inputs to the CNN-BiLSTM-Attention ensemble model, which generates deterministic runoff forecasts for 1–12 months ahead. Gaussian Process Regression (GPR) quantifies prediction uncertainty to produce interval probabilistic forecasts, while SHAP identifies the key driving factors, demonstrating through interpretable feature attribution that ENSO contributions are central to reducing prediction errors. Evaluated with comprehensive deterministic and probabilistic metrics in China's Yalong River Basin, the ensemble model achieves superior accuracy with highly reliable probabilistic intervals. Critically, the interpretable linkage between ENSO dominance and model performance validates that climate-informed deep learning synthesizes physical insights with data-driven advantages. This synergy, spanning dynamic factor screening, hybrid architecture design, uncertainty quantification, and explainable AI, provides actionable insights for climate-resilient flood control, water allocation, and ecosystem management.
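
The abstract does not spell out how GPR turns a deterministic forecast into an interval, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the GPR takes the deterministic forecast as its single input and the observed runoff as its target, uses a squared-exponential kernel with fixed hyperparameters, and converts the Gaussian predictive mean and standard deviation into a central interval. All function names, hyperparameter values, and the toy numbers are hypothetical.

```python
import math
from statistics import NormalDist


def rbf(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x


def solve_lower_transposed(L, b):
    """Back substitution: solve L^T x = b."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gpr_predict(x_train, y_train, x_new, noise=0.1, length_scale=1.0, variance=1.0):
    """GP predictive mean and standard deviation for a new observation at x_new.

    Targets are centred on their sample mean; hyperparameters are fixed by hand
    here (an assumption), whereas in practice they would be fitted to data.
    """
    n = len(x_train)
    y_bar = sum(y_train) / n
    K = [[rbf(x_train[i], x_train[j], length_scale, variance)
          + (noise ** 2 if i == j else 0.0) for j in range(n)] for i in range(n)]
    L = cholesky(K)
    # alpha = K^{-1} (y - y_bar), obtained from the two triangular solves
    alpha = solve_lower_transposed(L, solve_lower(L, [y - y_bar for y in y_train]))
    k_star = [rbf(x, x_new, length_scale, variance) for x in x_train]
    mean = y_bar + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = variance + noise ** 2 - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(var, 0.0))


def gaussian_interval(mean, std, level=0.90):
    """Central prediction interval of a Gaussian predictive distribution."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * std, mean + z * std


# Toy, made-up numbers (not data from the paper): deterministic forecasts and
# the runoff that was later observed, in arbitrary units.
toy_forecast = [1.0, 2.0, 3.0, 4.0]
toy_observed = [1.1, 1.9, 3.2, 3.9]

mean, std = gpr_predict(toy_forecast, toy_observed, 2.5)
lower, upper = gaussian_interval(mean, std, level=0.90)
print(f"90% interval for a forecast of 2.5: [{lower:.2f}, {upper:.2f}]")
```

In this sketch the interval widens as the new forecast moves away from the training forecasts, which is the behaviour that makes a GPR-style post-processor useful for flagging less reliable predictions.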

Source journal
Environmental Modelling & Software (Engineering Technology, Engineering: Environmental)
CiteScore: 9.30
Self-citation rate: 8.20%
Articles published: 241
Review time: 60 days
Journal description: Environmental Modelling & Software publishes contributions, in the form of research articles, reviews and short communications, on recent advances in environmental modelling and/or software. The aim is to improve our capacity to represent, understand, predict or manage the behaviour of environmental systems at all practical scales, and to communicate those improvements to a wide scientific and professional audience.