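Data-driven identification of pollution sources and water quality prediction using Apriori and LSTM models: A case study in the Hanjiang River basin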

Impact factor 3.5; Zone 3 (Environmental Science and Ecology); JCR Q2 (Environmental Sciences)
Mingyang Liu, Jiake Li, Yafang Li, Weijie Gao, Jingkun Lu
Journal of Contaminant Hydrology, volume 272, Article 104570. Published 2025-04-11.
DOI: 10.1016/j.jconhyd.2025.104570
Full text: https://www.sciencedirect.com/science/article/pii/S0169772225000750
Citations: 0

Abstract

Rapid urbanization and industrialization have exacerbated surface water pollution, especially from point sources such as industrial discharge and urban wastewater, posing a severe challenge to global environmental health and sustainable development. This study combines the Apriori algorithm and Long Short-Term Memory (LSTM) networks to identify major pollution sources and predict dynamic changes in water quality. The study area encompasses four national monitoring hydrological stations in the core area of the South-to-North Water Diversion Project; multi-source data were collected, including water quality parameters and industry-specific discharge data. Using the Apriori algorithm, the pollutants with the highest support were chemical oxygen demand (COD), copper (Cu), suspended solids (SS), and zinc (Zn), with a support value of 0.87, indicating that the metallurgical, electroplating, and chemical industries are the primary pollution sources. Further association rule analysis based on varying parameter thresholds revealed that when COD is present, the co-occurrence confidence for cadmium (Cd), Cu, lead (Pb), and SS reaches 0.9, and the combination of COD, Cu, Pb, SS, and cyanide (CN) achieves a confidence of 1, indicating a high degree of correlation among these pollutants. The LSTM model demonstrated high accuracy in water quality prediction, with root mean square error (RMSE) values for COD predictions at each hydrological station ranging from 0.2076 to 0.3366 and coefficients of determination (R²) all exceeding 0.9, highlighting the model's stability and predictive accuracy. This study provides a scientific basis for the sustainable management of watershed water resources and serves as a significant reference for environmental policymaking and water resource protection.
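The abstract reports its results in terms of four standard quantities: support and confidence for the association rules, and RMSE and R² for the predictions. The sketch below shows how these quantities are conventionally defined. It is an illustration only, not the authors' code: it does not implement the full Apriori candidate search or an LSTM, and the function names (`support`, `confidence`, `rmse`, `r_squared`) and the toy `transactions` list are assumptions made for this example, not data from the study.

```python
from math import sqrt

# Hypothetical toy data: each set lists pollutants detected together in one
# sample. These values are illustrative and are NOT taken from the study.
transactions = [
    {"COD", "Cu", "SS", "Zn"},
    {"COD", "Cu", "SS", "Zn", "Pb"},
    {"COD", "Cu", "SS", "Zn"},
    {"COD", "Cd", "Pb"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent union consequent) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot
```

With these definitions, a support of 0.87 for an itemset means it appears in 87% of the records, and a confidence of 1 for a rule means the consequent was present in every record that contained the antecedent. An RMSE is expressed in the units of the predicted variable, so the reported range of 0.2076 to 0.3366 should be read against the scale of the COD measurements.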
Source journal
Journal of Contaminant Hydrology
Category: Environmental Sciences; Geosciences, Multidisciplinary
CiteScore: 6.80
Self-citation rate: 2.80%
Articles published: 129
Review time: 68 days
About the journal: The Journal of Contaminant Hydrology is an international journal publishing scientific articles pertaining to the contamination of subsurface water resources. Emphasis is placed on investigations of the physical, chemical, and biological processes influencing the behavior and fate of organic and inorganic contaminants in the unsaturated (vadose) and saturated (groundwater) zones, as well as at groundwater-surface water interfaces. The ecological impacts of contaminants transported both from and to aquifers are of interest. Articles on contamination of surface water only, without a link to groundwater, are out of scope. Broad latitude is allowed in identifying contaminants of interest, which include legacy and emerging pollutants, nutrients, nanoparticles, pathogenic microorganisms (e.g., bacteria, viruses, protozoa), microplastics, and various constituents associated with energy production (e.g., methane, carbon dioxide, hydrogen sulfide). The journal's scope embraces a wide range of topics including: experimental investigations of contaminant sorption, diffusion, transformation, volatilization and transport in the surface and subsurface; characterization of soil and aquifer properties only as they influence contaminant behavior; development and testing of mathematical models of contaminant behavior; innovative techniques for restoration of contaminated sites; development of new tools or techniques for monitoring the extent of soil and groundwater contamination; transformation of contaminants in the hyporheic zone; effects of contaminants traversing the hyporheic zone on surface water and groundwater ecosystems; subsurface carbon sequestration and/or turnover; and migration of fluids associated with energy production into groundwater.