A guide to reverse metabolomics: a framework for big data discovery strategy

IF 13.1 | Zone 1, Biology | JCR Q1, Biochemical Research Methods

Nature Protocols | Pub Date: 2025-02-28 | DOI: 10.1038/s41596-024-01136-2

Vincent Charron-Lamoureux, Helena Mannochio-Russo, Santosh Lamichhane, Shipei Xing, Abubaker Patan, Paulo Wender Portal Gomes, Prajit Rajkumar, Victoria Deleray, Andrés Mauricio Caraballo-Rodríguez, Kee Voon Chua, Lye Siang Lee, Zhao Liu, Jianhong Ching, Mingxun Wang, Pieter C Dorrestein


Citations: 0

Abstract


A guide to reverse metabolomics: a framework for big data discovery strategy

Untargeted metabolomics is evolving into a field of big data science. There is a growing interest within the metabolomics community in mining tandem mass spectrometry (MS/MS)-based data from public repositories. In traditional untargeted metabolomics, samples to address a predefined question are collected and liquid chromatography with MS/MS data are generated. We then identify metabolites associated with a phenotype (for example, disease versus healthy) and elucidate or validate their structural details (for example, molecular formula, structural classification, substructure or complete structural annotation or identification). In reverse metabolomics, we start with MS/MS spectra for known or unknown molecules. These spectra are used as search terms to search public data repositories to discover phenotype-relevant information such as organ/biofluid distribution, disease condition, intervention status (for example, pre- and postintervention), organisms (for example, mammals versus others), geography and any other biologically relevant associations. Here we guide the reader through a four-part process: (1) obtaining the MS/MS spectra of interest (Universal Spectrum Identifier) and (2) Mass Spectrometry Search Tool searches to find the files associated with the MS/MS that are in available databases, (3) using the Reanalysis Data User Interface framework to link the files with their metadata and (4) validating the observations. Parts 1-3 could take from hours to days depending on the method used for collecting MS/MS spectra. For example, we use MS/MS spectra from three small molecules: phenylalanine-cholic acid (a microbially conjugated bile acid), phenylalanine-C4:0 and histidine-C4:0 (two N-acyl amides). We leverage the Global Natural Products Social Molecular Networking-based framework to explore the microbial producers of these molecules and their associations with health conditions and organ distributions in humans and rodents.
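The first three parts of the process described above can be pictured with a small, self-contained sketch. It is not the protocol's own code and calls no GNPS, MASST or ReDU service. It only assumes that a Universal Spectrum Identifier follows the general `mzspec:<collection>:<run>:<index type>:<index>` layout, that a search returns a list of matching file names, and that metadata is available as one record per file. The dataset name, file names and metadata fields below are all hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class USI(NamedTuple):
    collection: str   # dataset or library identifier
    ms_run: str       # file or library name
    index_type: str   # e.g. "scan" or "accession"
    index: str        # scan number or accession


def parse_usi(usi: str) -> USI:
    """Split a USI of the form mzspec:<collection>:<run>:<index type>:<index>.

    Simplification (assumption): the run name contains no colons, and any
    trailing interpretation field is ignored.
    """
    parts = usi.split(":")
    if len(parts) < 5 or parts[0] != "mzspec":
        raise ValueError(f"not a recognisable USI: {usi!r}")
    return USI(parts[1], parts[2], parts[3], parts[4])


def summarize_matches(matched_files, metadata, field):
    """Count matched files per value of one metadata field.

    Files with no metadata record are counted under "unknown".
    """
    counts = Counter()
    for name in matched_files:
        record = metadata.get(name, {})
        counts[record.get(field, "unknown")] += 1
    return dict(counts)


# Hypothetical inputs: a query USI (part 1), the files a spectral search
# might return (part 2), and a per-file metadata table (part 3).
QUERY = "mzspec:EXAMPLE_DATASET:example_run.mzML:scan:1234"
MATCHED_FILES = ["a.mzML", "b.mzML", "c.mzML", "d.mzML"]
METADATA = {
    "a.mzML": {"organism": "human", "body_part": "feces"},
    "b.mzML": {"organism": "human", "body_part": "feces"},
    "c.mzML": {"organism": "mouse", "body_part": "cecum"},
}

if __name__ == "__main__":
    print(parse_usi(QUERY))
    print(summarize_matches(MATCHED_FILES, METADATA, "organism"))
```

Counting unmatched files as "unknown" rather than dropping them keeps the share of files without usable metadata visible, which matters when the resulting associations are validated in part 4.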


Source journal: Nature Protocols (Biology, Biochemical Research Methods)
CiteScore: 29.10
Self-citation rate: 0.70%
Articles published: 128
Review time: 4 months

About the journal: Nature Protocols focuses on publishing protocols used to address significant biological and biomedical science research questions, including methods grounded in physics and chemistry with practical applications to biological problems. The journal caters to a primary audience of research scientists and, as such, exclusively publishes protocols with research applications. Protocols primarily aimed at influencing patient management and treatment decisions are not featured. The specific techniques covered encompass a wide range, including but not limited to: Biochemistry, Cell biology, Cell culture, Chemical modification, Computational biology, Developmental biology, Epigenomics, Genetic analysis, Genetic modification, Genomics, Imaging, Immunology, Isolation, purification, and separation, Lipidomics, Metabolomics, Microbiology, Model organisms, Nanotechnology, Neuroscience, Nucleic-acid-based molecular biology, Pharmacology, Plant biology, Protein analysis, Proteomics, Spectroscopy, Structural biology, Synthetic chemistry, Tissue culture, Toxicology, and Virology.