Using unsupervised machine learning and positive matrix factorization models to drive groundwater chemistry and associated health risks in a coal-mining rural region

IF 5.9 · Zone 1 (Earth Sciences) · JCR Q1 (Engineering, Civil)
Yuting Yan, Yunhui Zhang, Zhanxue Sun, Zhan Xie, Rongwen Yao, Si Chen, Md Galal Uddin, Yujun Pu, Chang Yang, Ying Wang, Yangshuang Wang
Journal of Hydrology, Volume 661, Article 133691. Published 2025-06-23. DOI: 10.1016/j.jhydrol.2025.133691
https://www.sciencedirect.com/science/article/pii/S0022169425010297

Abstract

Identifying and quantifying the geogenic and anthropogenic sources of heavy metals and nitrate in groundwater is essential for protecting the groundwater environment in rural mining areas. However, integrated approaches for clarifying groundwater chemistry, specific pollutant sources, and the associated probabilistic health risks in such areas have yet to be developed. In this study, unsupervised machine learning, compositional data analysis (CoDa) based on principal component analysis (PCA), positive matrix factorization (PMF), and Monte Carlo simulation of health risks were used to quantify pollutant sources and assess the drinking suitability of groundwater in a coal-mining region of northeastern Chongqing, SW China. Unsupervised machine learning identified three groups of groundwater samples. Group A was of the Ca-HCO3 type. Group B was dominated by Ca-SO4 and mixed Ca-Na-HCO3 types. Group C consisted of Ca-HCO3 and Ca-SO4 types. Groups A and C were controlled by the dissolution of carbonate rocks and silicates, while Group B was dominated by the dissolution of silicate rocks, pyrite, and heavy-metal oxides. Positive cation exchange was identified in all groundwater types. Agricultural activity and mining sewage discharge were the primary sources of nitrate contamination. For all hydrochemical components, CoDa-PCA identified three primary hydrochemical processes and the PMF model identified five factors. CoDa-PCA corroborated the preceding analyses based on hydrochemical diagrams and mineral saturation indices. The PMF analysis for all hydrochemical components, the natural background levels (NBLs), and the five-factor PMF for heavy metals indicated that Fe and Mn originated from the dissolution of Fe and Mn oxides in red beds (25.91 %). Co, Ni, Ba, Zn, Cu, and Hg were derived from the dissolution of oxides (22.35 %), barite (17.87 %), sphalerite (17.85 %), and chalcopyrite and cinnabar (16.02 %). The combined weighted water quality index (CWQI) and heavy metal pollution index (HPI) values of all groundwater samples were within the permissible limits for drinking water, indicating that groundwater in the study area was suitable for drinking. The hazard index (HI) values indicated an approximately 6.01 % probability that groundwater posed health risks to children above the acceptable limit (HI > 1). Human health risk was most sensitive to the exposure frequency to contaminated water and the NO3− concentration. Our study is expected to provide a reliable and robust basis for sustainable groundwater management in rural mining regions.
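To make the CoDa-PCA step named in the abstract more concrete, the sketch below applies a centred log-ratio (clr) transform to a small table of ion concentrations and then runs PCA through a singular value decomposition. The abstract does not say which log-ratio transform or software the authors used, so the clr choice, the function names `clr` and `coda_pca`, and every concentration value here are illustrative assumptions, not the paper's method or data.

```python
import numpy as np


def clr(X):
    """Centred log-ratio transform; rows are samples, columns are parts."""
    X = np.asarray(X, dtype=float)
    if np.any(X <= 0):
        # Zeros and non-detects must be imputed before taking logarithms.
        raise ValueError("clr needs strictly positive parts")
    log_x = np.log(X)
    # Subtracting the row mean of the logs divides each part by the
    # geometric mean of its sample.
    return log_x - log_x.mean(axis=1, keepdims=True)


def coda_pca(X, n_components=2):
    """PCA on clr-transformed data via SVD (assumed workflow, see lead-in)."""
    Z = clr(X)
    Zc = Z - Z.mean(axis=0)  # centre each clr variable
    U, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, explained


# Hypothetical concentrations in mg/L; columns: Ca, Na, HCO3, SO4, NO3.
samples = np.array([
    [60.0, 8.0, 210.0, 25.0, 6.0],
    [95.0, 30.0, 150.0, 220.0, 12.0],
    [70.0, 10.0, 230.0, 40.0, 4.0],
    [110.0, 45.0, 140.0, 260.0, 18.0],
    [65.0, 9.0, 220.0, 30.0, 5.0],
])

scores, loadings, explained = coda_pca(samples)
print("explained variance ratio:", np.round(explained, 3))
print("loadings (rows = Ca, Na, HCO3, SO4, NO3):")
print(np.round(loadings, 3))
```

Because the clr transform works on ratios between parts, rescaling a sample (for example, changing units) leaves its transformed values unchanged, which is the usual motivation for applying a log-ratio transform before PCA on concentration data.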

Source journal: Journal of Hydrology (Geosciences, Multidisciplinary)
CiteScore: 11.00
Self-citation rate: 12.50%
Articles published: 1309
Review time: 7.5 months
Journal description: The Journal of Hydrology publishes original research papers and comprehensive reviews in all the subfields of the hydrological sciences, including water-based management and policy issues that impact on economics and society. These comprise, but are not limited to, the physical, chemical, biogeochemical, stochastic and systems aspects of surface and groundwater hydrology, hydrometeorology and hydrogeology. Relevant topics incorporating the insights and methodologies of disciplines such as climatology, water resource systems, hydraulics, agrohydrology, geomorphology, soil science, instrumentation and remote sensing, civil and environmental engineering are included. Social science perspectives on hydrological problems such as resource and ecological economics, environmental sociology, psychology and behavioural science, management and policy analysis are also invited. Multi- and interdisciplinary analyses of hydrological problems are within scope. The science published in the Journal of Hydrology is relevant to catchment scales rather than exclusively to a local scale or site.