Gabriela May-Lagunes, Valerie Chau, Eric Ellestad, Leyla Greengard, Paolo D'Odorico, Puya Vahabi, Alberto Todeschini, Manuela Girotto
Journal of Hydrology X, volume 21, Article 100161. Published 2023-12-01. DOI: 10.1016/j.hydroa.2023.100161. https://www.sciencedirect.com/science/article/pii/S2589915523000147
Forecasting groundwater levels using machine learning methods: The case of California’s Central Valley
Groundwater, the second largest stock of freshwater on the planet, is an important water source for municipal supply, irrigation, and industrial needs. For instance, California’s arid Central Valley relies on groundwater to produce a quarter of the United States’ food demand, as farmers turn to this resource when surface water is scarce. Despite its importance, the nexus between groundwater dynamics and climate drivers remains difficult to quantify, model, and predict because of the lack of a comprehensive observation network. In this study, machine learning techniques were used to predict groundwater levels with a 3-month forecasting horizon for the Sacramento River Basin, using publicly available meteorological and hydrological datasets together with in-situ well-level measurements. Time series, ensemble-based, and deep-learning models, including transformers, were all tested. An ensemble-based XGBoost model performed best, with a mean standard deviation percent error (MSPE) of 32.23% and a root mean squared error (RMSE) of 1.05 meters (m) when using a 3-month forecasting horizon and when tested using a monthly rolling window over the years 2017–2020. The model proved better at predicting into wet months than into the dry summer months, and better at extracting seasonality than at explaining well-level residuals. Well-specific features, rather than exogenous meteorological features specific to the hydrological unit of the well, ranked as the most important features to the model. Though other forecasting horizons were tested, a 3-month look-ahead window gave the best balance of precision and accuracy: smaller forecasting horizons resulted in smaller RMSE but larger MSPE scores, and vice versa for larger forecasting horizons.
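The evaluation scheme described in the abstract (a fixed 3-month look-ahead, re-evaluated on a monthly rolling window and scored with RMSE) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the forecaster is a seasonal-naive stand-in rather than XGBoost, the well-level series is synthetic, and the names `seasonal_naive_forecast`, `rolling_rmse`, and `first_origin` are hypothetical. MSPE is left out because the abstract does not give its formula.

```python
import math


def seasonal_naive_forecast(history, horizon, season=12):
    """Stand-in forecaster (NOT the paper's XGBoost model).

    Predicts the level `horizon` months past the end of `history` by
    reusing the observation from the same calendar month one year earlier.
    Assumes horizon <= season and enough history to look back that far.
    """
    target_index = len(history) - 1 + horizon
    return history[target_index - season]


def rolling_rmse(series, forecast_fn, horizon=3, first_origin=24):
    """Rolling-origin evaluation with a fixed look-ahead.

    For each monthly origin, only data up to and including that month is
    passed to the forecaster, and the prediction is compared with the
    observation `horizon` months later. Returns the RMSE over all origins.
    """
    squared_errors = []
    for origin in range(first_origin, len(series) - horizon):
        history = series[: origin + 1]
        predicted = forecast_fn(history, horizon)
        actual = series[origin + horizon]
        squared_errors.append((predicted - actual) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Synthetic monthly well levels in meters (illustrative values only):
# a yearly cycle plus a steady trend of 0.1 m per month.
levels = [10.0 + 2.0 * math.sin(2 * math.pi * m / 12) + 0.1 * m
          for m in range(60)]

rmse_3_month = rolling_rmse(levels, seasonal_naive_forecast, horizon=3)
print(f"3-month-ahead RMSE: {rmse_3_month:.2f} m")
```

Swapping `forecast_fn` for a trained model's prediction function would keep the same rolling structure. The essential point is that each forecast sees only data available at its origin month, so the score reflects genuine look-ahead performance.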