Chemical characteristics vectors map the chemical space of natural biomes from untargeted mass spectrometry data

IF 7.1 | Zone 2, Chemistry | Q1 CHEMISTRY, MULTIDISCIPLINARY
Pilleriin Peets, Aristeidis Litos, Kai Dührkop, Daniel R. Garza, Justin J. J. van der Hooft, Sebastian Böcker, Bas E. Dutilh
Journal of Cheminformatics, volume 17 (1). Published 2025-05-26. Journal Article.
DOI: 10.1186/s13321-025-01031-2
https://link.springer.com/article/10.1186/s13321-025-01031-2
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01031-2
Citations: 0

Abstract

Untargeted metabolomics can comprehensively map the chemical space of a biome, but it is limited by low annotation rates (< 10%). We used chemical characteristics vectors (CCVs), which consist of molecular fingerprints or chemical compound classes predicted from mass spectrometry data, to characterize compounds and samples. CCVs estimate the fraction of compounds with specific chemical properties in a sample. Unlike aligned MS1 data with intensity information, CCVs incorporate the chemical properties of compounds, allowing chemical annotation to be used for sample comparison. Using CCVs, we identified compound classes that differentiate biomes: ethers are enriched in environmental biomes, whereas steroids are enriched in animal host-related biomes. In biomes with greater variability, CCVs revealed the key compound classes underlying the clustering, such as organonitrogen compounds in animal distal gut and lipids in animal secretions. CCVs thus enhance the interpretation of untargeted metabolomic data, providing a quantifiable and generalizable understanding of the chemical space of natural biomes.
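To make the idea of "the fraction of compounds with specific chemical properties in a sample" concrete, here is a minimal sketch. It is not the authors' pipeline: the abstract does not state the exact aggregation, so the code assumes that each compound comes with predicted probabilities for a few compound classes and that a sample-level CCV is the plain mean of those probabilities. The function names `sample_ccv` and `ccv_distance`, the distance measure, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def sample_ccv(compound_predictions: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-compound property probabilities into one sample-level vector.

    Assumption: the mean predicted probability approximates the fraction of
    compounds in the sample that carry each property.
    """
    if not compound_predictions:
        return {}
    properties = sorted({p for pred in compound_predictions for p in pred})
    n = len(compound_predictions)
    # A property missing from a compound's prediction is treated as 0.0.
    return {
        prop: sum(pred.get(prop, 0.0) for pred in compound_predictions) / n
        for prop in properties
    }


def ccv_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Mean absolute difference between two CCVs (an illustrative choice)."""
    props = set(a) | set(b)
    if not props:
        return 0.0
    return sum(abs(a.get(p, 0.0) - b.get(p, 0.0)) for p in props) / len(props)


# Made-up predicted class probabilities for two compounds per sample.
environmental = [
    {"ethers": 0.9, "steroids": 0.1},
    {"ethers": 0.7, "steroids": 0.1},
]
host_related = [
    {"ethers": 0.2, "steroids": 0.8},
    {"ethers": 0.0, "steroids": 0.6},
]

ccv_env = sample_ccv(environmental)
ccv_host = sample_ccv(host_related)
print(ccv_env)
print(ccv_host)
print(ccv_distance(ccv_env, ccv_host))
```

Because each entry of the vector refers to a chemical property, not to an aligned MS1 feature, two samples can be compared even when they share no individual compounds, and the entries that differ most point directly to the compound classes separating them.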

Chemical characteristics vectors allow samples to be compared using interpretable chemical information. By leveraging molecular fingerprints or compound classes, CCVs make use of the "chemical dark matter" that would otherwise be excluded. This approach enhances the interpretability of untargeted metabolomics data and reveals key chemical patterns across biomes.
Source journal
Journal of Cheminformatics (CHEMISTRY, MULTIDISCIPLINARY; COMPUTER SCIENCE, INFORMATION SYSTEMS)
CiteScore: 14.10
Self-citation rate: 7.00%
Articles published: 82
Review time: 3 months
Journal description: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases; molecular modelling; chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases; computer and molecular graphics; computer-aided molecular design; expert systems; QSAR; and data mining techniques.