A hybrid signal decomposition-machine learning benchmarking framework for multi-station precipitation prediction in the Kébir Rhumel basin (Algeria)
Authors: Aykut Erol, Issam Rehamnia, Hatice Citakoglu
Journal: Acta Geophysica, 74 (3)
Publication date: 2026-05-08
Publication type: Journal Article
DOI: 10.1007/s11600-026-01888-3
URL: https://link.springer.com/article/10.1007/s11600-026-01888-3
Citation count: 0
Abstract
Reliable multi-station precipitation forecasting is challenging due to nonstationarity, noise, and spatial heterogeneity. This paper introduces a hybrid signal decomposition-machine learning benchmarking framework that integrates four decomposition methods (TQWT, MODWT, EWT, VMD) with three learners (Bagging, LSBoost, KNN), yielding twelve hybrid models. The models were tested across twelve stations in the Kébir Rhumel Basin using eight statistical metrics and distributional diagnostics to assess accuracy, stability, and generalization. Two dominant families emerged: TQWT-based hybrids achieved the best localized accuracy at four stations, while MODWT-Bagging led at eight stations and delivered the most consistent cross-station performance. MODWT-Bagging achieved R² = 0.984–0.993 and NSE = 0.981–0.993, with RMSE ranging from 2.64 to 6.34, demonstrating strong predictive skill under varying hydro-climatic conditions. In noise-rich environments, it substantially reduced errors; for example, at El Milia, RMSE dropped from 12.57 (VMD-LSBoost) to 6.03, a reduction of about 52%, and improvements of up to 63% were observed at other stations. Its superiority stems from MODWT's shift-invariance and noise robustness combined with Bagging's variance reduction. Taylor diagrams and violin plots confirmed centered, compact error structures, while scatter plots verified accurate phase and magnitude tracking. By clarifying how decomposition structure and learner characteristics interact across heterogeneous regimes, this framework fills a key gap in signal decomposition-machine learning model selection. The findings support adaptive hybrid design for early warning, water resource management, and precipitation-driven forecasting systems. Overall, MODWT-Bagging is established as a robust default for complex precipitation modeling, and the proposed framework provides a scalable foundation for next-generation hybrid predictive tools.
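The benchmarking grid and the headline arithmetic in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, R² is assumed here to be the squared Pearson correlation (the abstract does not state which definition was used), and NSE follows the standard Nash-Sutcliffe formula. Only three of the eight metrics are shown, since the abstract does not list the others.

```python
from itertools import product
from math import sqrt

# Method names taken from the abstract; the pairing scheme is a plain
# Cartesian product (4 decompositions x 3 learners = 12 hybrids).
DECOMPOSITIONS = ["TQWT", "MODWT", "EWT", "VMD"]
LEARNERS = ["Bagging", "LSBoost", "KNN"]
HYBRIDS = [f"{d}-{l}" for d, l in product(DECOMPOSITIONS, LEARNERS)]


def rmse(obs, sim):
    """Root mean square error, in the units of the observations."""
    return sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over the
    squared deviation of the observations from their mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r_squared(obs, sim):
    """Squared Pearson correlation (an assumed definition of R^2)."""
    n = len(obs)
    mean_obs, mean_sim = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    print(len(HYBRIDS), "hybrids:", ", ".join(HYBRIDS))
    # El Milia figures quoted in the abstract: 12.57 -> 6.03.
    print(f"RMSE reduction: {relative_reduction(12.57, 6.03):.1f}%")
```

With the El Milia values from the abstract, `relative_reduction(12.57, 6.03)` evaluates to roughly 52%, which matches the reported figure. Note that NSE and the squared-correlation R² coincide only when predictions are unbiased and correctly scaled, which is consistent with the abstract reporting slightly different ranges for the two.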
About the journal:
Acta Geophysica is open to all kinds of manuscripts, including research and review articles, short communications, comments on published papers, letters to the Editor, and book reviews. Some issues are fully devoted to particular topics, and proposals for such topical issues are encouraged. The journal accepts submissions from scientists worldwide and offers high scientific and editorial standards and comprehensive treatment of the topics discussed.