Integrated machine learning-based optimization framework for surface water quality index comparing coastal and non-coastal cases of Guangxi, China
Authors: Xizhi Nong, Fengcheng He, Lihua Chen, Jiahua Wei
Journal: Marine Pollution Bulletin, Volume 213, Article 117564 (Impact Factor 5.3; JCR Q1, Environmental Sciences)
Publication type: Journal Article
Publication date: 2025-02-03
DOI: 10.1016/j.marpolbul.2025.117564
URL: https://www.sciencedirect.com/science/article/pii/S0025326X25000396
Citations: 0
Abstract
This study develops an optimized comprehensive water quality index (WQI) model framework that incorporates machine learning to compare water quality assessments across different types of surface water. The framework improves four methodological steps: water quality parameter selection, parameter normalization, weight determination, and comparison of WQI aggregation functions. The Random Forest (RF) algorithm ranks water quality parameters by their relative importance in determining the overall water quality regime, and the parameter weights are determined using the Rank Order Centroid (ROC) method. Parameter normalization follows national standards and transforms observation data into dimensionless values on a unified scale. The sensitivity and prediction error of four distinct WQI models were then compared, with Multiple Linear Regression (MLR) models used to assess the sensitivity and precision of the WQI models. A comparative case study covering typical coastal and non-coastal regions of China, both within Guangxi Zhuang Autonomous Region (Guangxi), was conducted to verify the robustness and adaptability of WQI model performance. The results show that the overall water quality in Guangxi was generally at the “Good” or “Medium” level. Water quality in the river systems of Guangxi showed significant spatial heterogeneity: the non-coastal region had better water quality than the coastal region, almost at the “Good” level. The weighted quadratic mean (WQM) and the unweighted root mean square (RMS) models were selected as the most suitable WQI models for water quality evaluation in the coastal and non-coastal regions of Guangxi. Water quality in the coastal region was almost at the “Medium” level, with average WQIs of 74.27 (WQM) and 76.51 (RMS). In the non-coastal region, the average WQIs were 85.39 (WQM) and 88.81 (RMS). This study can provide a valuable and reliable scientific reference for administrative bodies implementing effective water environment risk prevention and management measures.
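The weighting and aggregation steps named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the textbook ROC formula, w_i = (1/n) * sum over k = i..n of 1/k, and the usual definitions of the weighted quadratic mean, sqrt(sum of w_i * s_i^2), and the unweighted root mean square, sqrt((1/n) * sum of s_i^2). The function names and the sub-index scores are hypothetical, and the scores are assumed to be already normalized to a 0-100 scale and ordered by importance rank (for example, a rank produced by an RF model).

```python
import math


def roc_weights(n):
    """Rank Order Centroid weights for n parameters ranked 1 (most important) to n.

    w_i = (1/n) * sum_{k=i}^{n} 1/k; the weights are decreasing and sum to 1.
    """
    return [sum(1.0 / k for k in range(i, n + 1)) / n for i in range(1, n + 1)]


def wqm(sub_indices, weights):
    """Weighted quadratic mean of sub-index scores (weights assumed to sum to 1)."""
    return math.sqrt(sum(w * s * s for w, s in zip(weights, sub_indices)))


def rms(sub_indices):
    """Unweighted root mean square of sub-index scores."""
    return math.sqrt(sum(s * s for s in sub_indices) / len(sub_indices))


if __name__ == "__main__":
    # Hypothetical normalized sub-index scores, ordered from the most to the
    # least important parameter. These are NOT data from the study.
    scores = [90.0, 80.0, 70.0, 60.0]
    weights = roc_weights(len(scores))
    print("ROC weights:", [round(w, 4) for w in weights])
    print("WQM:", round(wqm(scores, weights), 2))
    print("RMS:", round(rms(scores), 2))
```

Because ROC weights decrease with rank, the WQM score is pulled toward the sub-indices of the highest-ranked parameters, whereas RMS treats every parameter equally. This is one plausible reason the two aggregation functions yield different averages for the same region.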
About the journal:
Marine Pollution Bulletin is concerned with the rational use of maritime and marine resources in estuaries, the seas and oceans, as well as with documenting marine pollution and introducing new forms of measurement and analysis. A wide range of topics are discussed as news, comment, reviews and research reports, not only on effluent disposal and pollution control, but also on the management, economic aspects and protection of the marine environment in general.