Shiwei Yang, Ruifeng Liang, Junguang Chen, Yuanming Wang, Kefeng Li
Water Science and Technology, Journal Article, published 2024-03-01. DOI: 10.2166/wst.2024.068 (https://doi.org/10.2166/wst.2024.068)
Estimating the water quality index based on interpretable machine learning models.
The water quality index (WQI) is an important tool for evaluating the water quality status of lakes. In this study, we used the WQI to evaluate the spatial water quality characteristics of Dianchi Lake. However, calculating the WQI is time-consuming, whereas machine learning models offer significant advantages in timeliness and nonlinear data fitting. We used a machine learning model with optimized parameters to predict the WQI, and the light gradient boosting machine achieved good predictive performance. The model trained on water quality data for the entire Dianchi Lake achieved a coefficient of determination (R²), mean square error, and mean absolute error of 0.989, 0.228, and 0.298, respectively. In addition, we used the Shapley additive explanations (SHAP) method to interpret and analyse the model and identified NH₄⁺-N as the main water quality parameter affecting the WQI of Dianchi Lake. Across the entire lake, the SHAP values of NH₄⁺-N ranged from -9 to 3. Future water environmental governance should therefore focus on changes in NH₄⁺-N. These results can provide a reference for the treatment of lake water environments.
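As a rough illustration of the quantities named in the abstract, the sketch below computes the three reported error metrics (coefficient of determination, mean square error, mean absolute error) and exact Shapley values for a two-feature toy function. Everything in it is an assumption made for illustration: the function names, the `toy_wqi` formula and its coefficients, and the sample inputs are hypothetical, and it uses neither the authors' light gradient boosting model nor the SHAP library. It only shows what a per-feature Shapley contribution such as "-9 to 3" means: a signed shift of the prediction relative to a baseline.

```python
from itertools import permutations
from math import factorial


def regression_metrics(y_true, y_pred):
    """Return (R^2, MSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    r2 = 1.0 - ss_res / ss_tot
    return r2, mse, mae


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features not yet "switched on" keep their baseline value. This brute-force
    enumeration over feature orderings is only feasible for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]
            updated = model(current)
            phi[i] += updated - previous  # marginal contribution of feature i
            previous = updated
    return [p / factorial(n) for p in phi]


def toy_wqi(features):
    """Hypothetical stand-in for a trained model; NOT the WQI formula of the paper."""
    nh4_n, dissolved_oxygen = features
    return 80.0 - 10.0 * nh4_n + 2.0 * dissolved_oxygen - nh4_n * dissolved_oxygen


if __name__ == "__main__":
    r2, mse, mae = regression_metrics([1, 2, 3, 4], [2, 2, 3, 3])
    print("R2, MSE, MAE:", r2, mse, mae)

    sample, reference = [1.5, 6.0], [0.5, 7.0]
    contributions = shapley_values(toy_wqi, sample, reference)
    print("Shapley contributions:", contributions)
    # The contributions add up to the difference between the two predictions.
    print("Prediction shift:", toy_wqi(sample) - toy_wqi(reference))
```

The contributions always sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that gives SHAP its name. In practice, a tree-specific explainer would replace the brute-force enumeration, which grows factorially with the number of features.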
About the journal:
Water Science and Technology publishes peer-reviewed papers on all aspects of the science and technology of water and wastewater. Papers are selected by a rigorous peer review procedure with the aim of rapid and wide dissemination of research results, development and application of new techniques, and related managerial and policy issues. Scientists, engineers, consultants, managers and policy-makers will find this journal essential as a permanent record of progress of research activities and their practical applications.