Yongchun Liang, Fangyu Ding, Lei Liu, Fang Yin, Mengmeng Hao, Tingting Kang, Chuanpeng Zhao, Ziteng Wang, Dong Jiang
Monitoring water quality parameters in urban rivers using multi-source data and machine learning approach
Journal of Hydrology, Volume 648, Article 132394. Published 2024-11-23. DOI: 10.1016/j.jhydrol.2024.132394
URL: https://www.sciencedirect.com/science/article/pii/S0022169424017906
Impact factor: 5.9. JCR: Q1 (Engineering, Civil).
Citation count: 0
Abstract
The systematic surveillance of nutrients and organic pollution in urban rivers is crucial for enhancing ecological integrity and promoting societal and economic sustainability. Currently, the primary methods of water quality monitoring involve on-site sampling and laboratory analysis, which are constrained by various factors such as terrain and climate. Remote sensing water quality monitoring, which enables large-scale, periodic, and comprehensive coverage, serves as an important supplement to these traditional methods. However, most current research on water quality monitoring predominantly relies on remote sensing technology, often overlooking the application of other multi-source data. In this study, we examined rivers in the Weihe River Basin by integrating field samples, Sentinel-2 multispectral imagery, meteorological elements, and land use types to construct machine learning (ML) models for predicting four water quality parameters (WQPs): ammonia nitrogen (NH₃-N), total phosphorus (TP), chemical oxygen demand (COD), and dissolved oxygen (DO). The results showed that land use types significantly influenced the accuracy of predictions for NH₃-N, TP, COD, and DO. Among the models evaluated, the Extra Tree Regression (ETR), eXtreme Gradient Boosting (XGBoost), and Gradient Boosting Regression (GBR) demonstrated the highest accuracy and transferability for monitoring WQPs in rivers. For instance, the models achieved the following coefficients of determination (R²) in 5-fold cross-validation: for NH₃-N, R² was 0.65 in both the testing and validation datasets; for TP, R² was 0.71 and 0.68; for COD, R² was 0.50 and 0.47; and for DO, R² was 0.68 and 0.64, respectively. Therefore, our findings underscore the feasibility of using multi-source data and ML methods to quantify water pollutants in urban rivers, providing essential technical support for monitoring the spatiotemporal dynamics of river water quality across extensive geographical areas.
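The evaluation described in the abstract (tree-ensemble regressors scored by R² under 5-fold cross-validation) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the feature table is synthetic, the column groupings (band reflectances, weather variables, land-use fractions) and the target are made up, and scikit-learn's ExtraTreesRegressor and GradientBoostingRegressor are assumed as stand-ins for the ETR and GBR models. XGBoost is left out to keep the example dependency-free.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for a multi-source feature table (NOT the study's data).
n_samples = 200
spectral = rng.uniform(0.0, 0.3, size=(n_samples, 4))  # hypothetical band reflectances
weather = rng.normal(size=(n_samples, 2))              # hypothetical meteorological variables
land_use = rng.uniform(0.0, 1.0, size=(n_samples, 2))  # hypothetical land-use fractions
X = np.hstack([spectral, weather, land_use])

# Made-up target standing in for one water quality parameter.
y = (
    2.0 * land_use[:, 0]
    + 5.0 * spectral[:, 1]
    + 0.3 * weather[:, 0]
    + rng.normal(scale=0.2, size=n_samples)
)

# Assumed stand-ins for two of the model families named in the abstract.
models = {
    "ETR": ExtraTreesRegressor(n_estimators=200, random_state=0),
    "GBR": GradientBoostingRegressor(random_state=0),
}

# 5-fold cross-validation, scoring each held-out fold by R^2.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = {}
for name, model in models.items():
    scores[name] = cross_val_score(model, X, y, cv=cv, scoring="r2")
    print(f"{name}: mean R2 over 5 folds = {scores[name].mean():.2f}")
```

With real data, the synthetic arrays would be replaced by field samples matched to imagery, weather, and land-use features. A spatially grouped split may be preferable to a random KFold when transferability across river sections is the question.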
About the journal:
The Journal of Hydrology publishes original research papers and comprehensive reviews in all the subfields of the hydrological sciences, including water-based management and policy issues that impact economics and society. These comprise, but are not limited to, the physical, chemical, biogeochemical, stochastic, and systems aspects of surface and groundwater hydrology, hydrometeorology, and hydrogeology. Relevant topics incorporating the insights and methodologies of disciplines such as climatology, water resource systems, hydraulics, agrohydrology, geomorphology, soil science, instrumentation and remote sensing, and civil and environmental engineering are included. Social science perspectives on hydrological problems, such as resource and ecological economics, environmental sociology, psychology and behavioural science, and management and policy analysis, are also invited. Multi- and interdisciplinary analyses of hydrological problems are within scope. The science published in the Journal of Hydrology is relevant to catchment scales rather than exclusively to a local scale or site.