Dimensionality reduction for groundwater forecasting under drought and intensive irrigation with neural networks
Tarik Bouramtane, Ismail Mohsine, Nourelhouda Karmouda, Marc Leblanc, Yannick Estève, Ilias Kacimi, Mohamed Hilali, Salima Mdhaffar, Sarah Tweed, Mounia Tahiri, Nadia Kassou, Ali El Bilali, Omar Chafki
Journal of Hydrology: Regional Studies, volume 60, Article 102477. Published 2025-05-28. DOI: 10.1016/j.ejrh.2025.102477
https://www.sciencedirect.com/science/article/pii/S2214581825003027
Abstract
Study Region: This study focuses on the Berrechid aquifer system in northern Morocco.
Study Focus: The research explores Principal Component Analysis (PCA) for optimizing input selection in groundwater level forecasting using neural networks. PCA efficiently reduces input dimensionality while preserving critical information, which makes it useful for neural network modelling of natural systems that have many input variables, particularly in low-resource scenarios requiring feature engineering. A Long Short-Term Memory (LSTM) model predicted groundwater levels in six monitoring bores using four hydro-climatic variables: precipitation, land surface temperature (LST), actual evapotranspiration (AET), and the normalized difference vegetation index (NDVI). Model performance was compared using two approaches: the LSTM-XGB model with the best-selected input features and the LSTM-PC1 model based on the first principal component (PC1).
New Hydrological Insights for the Region: Results showed that NDVI, AET, and LST were the dominant inputs across different monitoring bores. On average, PC1 accounted for 68.3% of the variance in the hydro-climatic variables, with an eigenvalue of 2.75, surpassing the combined variance of two individual hydro-climatic variables. Both models performed effectively, achieving R² values of 0.982–0.999 during training and 0.885–0.999 during validation. The models successfully captured groundwater fluctuations and the declining trend during drought. LSTM-XGB slightly outperformed LSTM-PC1 in certain cases, but the differences were minimal. The use of PC1 not only mitigates overfitting risks but also allows for generalized predictions across multiple monitoring sites, making it a practical choice for large datasets.
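To make the PC1 step concrete, the following is a minimal sketch of how four hydro-climatic series could be reduced to a single principal component. It is not the authors' pipeline: the data are synthetic placeholders, the function name `first_principal_component` is hypothetical, and the use of standardized variables (PCA on the correlation matrix) is an assumption that the abstract does not state. Under that assumption the eigenvalues sum to the number of variables, so an eigenvalue of 2.75 out of 4 would correspond to about 68.75% of the total variance, close to the reported average of 68.3%.

```python
import numpy as np


def first_principal_component(X):
    """Return PC1 scores, the leading eigenvalue, and its share of total variance.

    X is an (n_samples, n_variables) array. Columns are standardized first,
    so the decomposition is of the correlation matrix (an assumption here).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(corr)
    lead = eigvals[-1]
    scores = Z @ eigvecs[:, -1]
    return scores, lead, lead / eigvals.sum()


# Synthetic monthly series standing in for precipitation, LST, AET and NDVI.
rng = np.random.default_rng(0)
n = 240
seasonal = np.sin(2 * np.pi * np.arange(n) / 12)
X = np.column_stack([
    -seasonal + 0.5 * rng.normal(size=n),       # placeholder "precipitation"
    seasonal + 0.3 * rng.normal(size=n),        # placeholder "LST"
    seasonal + 0.4 * rng.normal(size=n),        # placeholder "AET"
    0.8 * seasonal + 0.5 * rng.normal(size=n),  # placeholder "NDVI"
])

pc1, eigenvalue, ratio = first_principal_component(X)
print(f"PC1 eigenvalue: {eigenvalue:.2f}, explained variance: {ratio:.1%}")
```

In a setup like the one described, the `pc1` series would replace the four original inputs as the single feature fed to the forecasting model. The sign of PC1 is arbitrary, which does not affect its explained variance.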
About the journal:
Journal of Hydrology: Regional Studies publishes original research papers enhancing the science of hydrology and aiming at region-specific problems, past and future conditions, analysis, review and solutions. The journal particularly welcomes research papers that deliver new insights into region-specific hydrological processes and responses to changing conditions, as well as contributions that incorporate interdisciplinarity and translational science.