Cheng Shi, Corey M. G. Carpenter, Damian E. Helbling, and Gerrad D. Jones*
Environmental Science & Technology, vol. 59, no. 36, pp. 19307–19317. Published 2025-09-05. Journal article. Impact factor: 11.3; JCR Q1 (Engineering, Environmental).
DOI: 10.1021/acs.est.5c07541. URL: https://pubs.acs.org/doi/10.1021/acs.est.5c07541
Deconvoluting and Interpreting Nontargeted Chemical Data: A Data-Driven Forensic Workflow for Identifying the Most Prominent Chemical Sources in Receiving Waters
Chemical forensics aims to identify major contamination sources, but existing workflows often rely on predefined targets and known sources, introducing bias. Here, we present a data-driven workflow that reduces this bias by applying an unsupervised machine learning technique. We applied both nonmetric multidimensional scaling (NMDS) and non-negative matrix factorization (NMF) on the same nontargeted chemical data set to compare their different interpretations of environmental sources. Weekly nontargeted data was collected from the Fall Creek Monitoring Station (Ithaca, NY), where daily samples were previously analyzed using source-defined models. NMF was first used to decompose the full nontargeted chemical data set into a small set of chemical factors representing distinct composition profiles. Each factor was then interpreted through (1) Spearman correlations with watershed characteristics (e.g., temperature, flow) and (2) suspect screening of high-weighted nontargeted features. In addition to confirming known anthropogenic inputs, our analysis revealed potential novel sources associated with snowmelt, groundwater seepage, and seasonal hydrological dynamics. We also detected an annual shift in the chemical composition, highlighting the evolving influence of these sources. This workflow enables watershed managers to move beyond predefined sources, detect both known and emerging chemical contributors, and apply adaptive, evidence-based strategies to protect water quality under changing conditions.
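The abstract gives no implementation details, so the following is only a minimal sketch of the two steps it describes: decomposing a samples-by-features intensity matrix with NMF, then relating each factor's per-sample weights to a watershed variable through Spearman correlation. The toy matrix, the `flow` series, the function names, and the choice of Lee-Seung multiplicative updates as the solver are all illustrative assumptions, not the authors' data or method.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative X (n x m) as W (n x k) times H (k x m)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (X H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def rel_error(X, W, H):
    """Relative Frobenius-norm reconstruction error of W @ H against X."""
    R = matmul(W, H)
    num = sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))
    den = sum(x * x for row in X for x in row)
    return (num / den) ** 0.5

def ranks(v):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for t in range(i, j + 1):
            r[order[t]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical toy data: 6 samples x 4 features built from two made-up
# "sources", one tracking flow and one tracking temperature.
flow = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
temperature = [6.0, 1.0, 5.0, 2.0, 4.0, 3.0]
profiles = [[1.0, 0.0, 2.0, 0.0], [0.0, 3.0, 0.0, 1.0]]
X = matmul(transpose([flow, temperature]), profiles)

W, H = nmf(X, k=2)
print("relative reconstruction error:", rel_error(X, W, H))
for a in range(2):
    weights = [row[a] for row in W]  # per-sample weight of factor a
    print("factor", a, "rho vs flow:", spearman(weights, flow))
```

In this sketch the rows of `H` play the role of the chemical composition profiles and the columns of `W` give each factor's contribution per sample; the high-weighted entries of a row of `H` would be the features passed on to suspect screening. Factor order is arbitrary and NMF solutions are generally not unique, so a real analysis would typically compare several initializations and factor counts.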