Anjana Elapavalore, Dylan H Ross, Valentin Grouès, Dagny Aurich, Allison M Krinsky, Sunghwan Kim, Paul A Thiessen, Jian Zhang, James N Dodds, Erin S Baker, Evan E Bolton, Libin Xu, Emma L Schymanski
Journal: Environmental Science & Technology Letters, volume 12, issue 2, pages 166-174
Publication date: 2025-01-24 (EPub date: 2025-02-11, eCollection)
Publication type: Journal Article
DOI: 10.1021/acs.estlett.4c01003 (https://doi.org/10.1021/acs.estlett.4c01003)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823450/pdf/
Impact factor: 8.9; JCR: Q1 (Engineering, Environmental); Region 2 (Environmental Science and Ecology)
Citations: 0
PubChemLite Plus Collision Cross Section (CCS) Values for Enhanced Interpretation of Nontarget Environmental Data
Abstract
Finding relevant chemicals in the vast (known) chemical space is a major challenge for environmental and exposomics studies leveraging nontarget high resolution mass spectrometry (NT-HRMS) methods. Chemical databases now contain hundreds of millions of chemicals, yet many are not relevant. This article details an extensive collaborative, open science effort to provide a dynamic collection of chemicals for environmental, metabolomics, and exposomics research, along with supporting information about their relevance to assist researchers in the interpretation of candidate hits. The PubChemLite for Exposomics collection is compiled from ten annotation categories within PubChem, enhanced with patent, literature and annotation counts, predicted partition coefficient (logP) values, as well as predicted collision cross section (CCS) values using CCSbase. Monthly versions are archived on Zenodo under a CC-BY license, supporting reproducible research, and a new interface has been developed, including historical trends of patent and literature data, for researchers to browse the collection. This article details how PubChemLite can support researchers in environmental and exposomics studies, describes efforts to increase the availability of experimental CCS values, and explores known limitations and potential for future developments. The data and code behind these efforts are openly available. PubChemLite can be browsed at https://pubchemlite.lcsb.uni.lu.
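As a rough illustration of how predicted CCS values might help narrow down candidate hits, the Python sketch below keeps only the candidates whose adduct m/z and predicted CCS both fall within a tolerance of the measured values. The field names, the tolerances (5 ppm, 3%), and the example numbers are all assumptions made for this sketch; they are not taken from the article or from the PubChemLite file format.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # All field names are hypothetical; real PubChemLite columns may differ.
    name: str
    adduct_mz: float      # theoretical m/z of the adduct of interest
    predicted_ccs: float  # predicted CCS for that adduct, in square angstroms


def within_ppm(theoretical: float, measured: float, ppm: float) -> bool:
    """True if the measured m/z lies within `ppm` parts per million."""
    return abs(measured - theoretical) / theoretical * 1e6 <= ppm


def within_percent(predicted: float, measured: float, percent: float) -> bool:
    """True if the measured CCS lies within `percent` of the prediction."""
    return abs(measured - predicted) / predicted * 100.0 <= percent


def filter_candidates(candidates, measured_mz, measured_ccs,
                      ppm=5.0, ccs_percent=3.0):
    """Keep candidates matching both m/z and CCS (tolerances are assumed)."""
    return [
        c for c in candidates
        if within_ppm(c.adduct_mz, measured_mz, ppm)
        and within_percent(c.predicted_ccs, measured_ccs, ccs_percent)
    ]


# Made-up example values, for illustration only.
candidates = [
    Candidate("candidate_A", 300.1000, 170.0),
    Candidate("candidate_B", 300.1005, 185.0),  # mass matches, CCS does not
    Candidate("candidate_C", 301.2000, 171.0),  # mass does not match
]
hits = filter_candidates(candidates, measured_mz=300.1002, measured_ccs=171.5)
print([c.name for c in hits])
```

In this toy case two candidates are indistinguishable by mass alone, and the CCS check removes one of them. Suitable tolerances depend on the instrument and on the accuracy of the CCS prediction, so the defaults here should be treated as placeholders.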
About the journal:
Environmental Science & Technology Letters serves as an international forum for brief communications on experimental or theoretical results of exceptional timeliness in all aspects of environmental science, both pure and applied. Published as soon as accepted, these communications are summarized in monthly issues. Additionally, the journal features short reviews on emerging topics in environmental science and technology.