Shapelet-based decomposition stack machine learning model explains more middle river reaches water level hydrological process with high accuracy early warning
Author: Songhua Huan
Journal: Journal of Hydrology, Volume 662, Article 133927 (Journal Article; JCR Q1, Engineering, Civil; impact factor 6.3)
Published: 2025-07-16
DOI: 10.1016/j.jhydrol.2025.133927
URL: https://www.sciencedirect.com/science/article/pii/S002216942501265X
Citations: 0
Abstract
Flooding remains one of the most devastating natural hazards worldwide, yet understanding the complex hydrological processes that lead to flooding poses a significant challenge, hindering effective prevention efforts. To address this issue, this study proposes a stacked machine learning framework that integrates the Offline Shapelet Discovery (OSD) technique. Hydrological time series data are first decomposed using Empirical Wavelet Transform (EWT), and OSD is applied to generate a pool of potential shapelets for training. These shapelets are then processed using a deep learning model to produce preliminary predictions. Finally, an ensemble machine learning approach integrates these sub-predictions to generate the final forecast. The model is evaluated in the Pearl River Basin, a representative watershed encompassing several major urban areas. Compared with traditional machine learning methods, the proposed model demonstrates superior predictive performance across six stations located in the upper, middle and lower reaches of the basin. In the upper reaches, the model achieves a mean absolute error (MAE) of 0.2265, mean square error (MSE) of 0.0723, root mean square error (RMSE) of 0.2679, mean absolute percentage error (MAPE) of 0.0038, percent bias (PBIAS) of 0.0034 and Nash-Sutcliffe efficiency (NSE) of 0.8103. In the lower reaches, the respective values are 0.1766, 0.0619, 0.2720, 0.0415, −0.0007 and 0.8739, while in the middle reaches, they are 0.1239, 0.0362, 0.1890, 0.0059, 0.0007 and 0.9228. The shapelet pool reveals distinctive water level patterns, notably “up-down-up-up” and “down-down-up-down” types across various river segments. This study contributes to a deeper understanding of complex hydrological behaviors and provides new insights for enhancing flood prediction and prevention strategies through innovative data decomposition and pattern recognition techniques.
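The six scores reported in the abstract follow standard definitions, and a minimal Python sketch of them may help when comparing the figures across reaches. The function name `forecast_metrics` is hypothetical and not taken from the paper. Two conventions here are assumptions, since the abstract does not state them: MAPE and PBIAS are returned as fractions rather than percentages, and PBIAS uses the observed-minus-predicted sign convention, so a positive value indicates underestimation.

```python
import math


def forecast_metrics(observed, predicted):
    """Return MAE, MSE, RMSE, MAPE, PBIAS and NSE for two paired series.

    Assumptions (not stated in the abstract): MAPE and PBIAS are fractions,
    and PBIAS is computed from observed minus predicted values.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")

    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by each observation, so no observed value may be zero.
    mape = sum(abs(e / o) for e, o in zip(errors, observed)) / n
    # PBIAS: total error relative to the total observed magnitude.
    pbias = sum(errors) / sum(observed)
    # NSE: 1 minus squared error over the variance of the observations;
    # 1.0 is a perfect fit, 0.0 is no better than predicting the mean.
    nse = 1.0 - sum(e * e for e in errors) / sum(
        (o - mean_obs) ** 2 for o in observed
    )

    return {"MAE": mae, "MSE": mse, "RMSE": rmse,
            "MAPE": mape, "PBIAS": pbias, "NSE": nse}


if __name__ == "__main__":
    # Illustrative water levels only; these are not data from the study.
    observed = [10.0, 12.0, 11.0, 13.0]
    predicted = [10.5, 11.5, 11.0, 12.0]
    for name, value in forecast_metrics(observed, predicted).items():
        print(f"{name}: {value:.4f}")
```

NSE is the only score in the set where larger is better, which is why the middle-reach value of 0.9228 indicates the best fit of the three reaches while its error metrics are the smallest.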
About the journal:
The Journal of Hydrology publishes original research papers and comprehensive reviews in all the subfields of the hydrological sciences, including water-based management and policy issues that impact on economics and society. These comprise, but are not limited to, the physical, chemical, biogeochemical, stochastic and systems aspects of surface and groundwater hydrology, hydrometeorology and hydrogeology. Relevant topics incorporating the insights and methodologies of disciplines such as climatology, water resource systems, hydraulics, agrohydrology, geomorphology, soil science, instrumentation and remote sensing, and civil and environmental engineering are included. Social science perspectives on hydrological problems, such as resource and ecological economics, environmental sociology, psychology and behavioural science, and management and policy analysis, are also invited. Multi- and interdisciplinary analyses of hydrological problems are within scope. The science published in the Journal of Hydrology is relevant to catchment scales rather than exclusively to a local scale or site.