The development of machine learning approaches in two-dimensional NMR data interpretation for metabolomics applications

IF 2.6 · Zone 4 (Biology) · JCR Q2, Biochemical Research Methods
Julie Pollak, Moses Mayonu, Lin Jiang, Bo Wang
Analytical Biochemistry, vol. 695, Article 115654. Published 2024-08-24.
DOI: 10.1016/j.ab.2024.115654
https://www.sciencedirect.com/science/article/pii/S0003269724001982
Citations: 0

Abstract


The development of machine learning approaches in two-dimensional NMR data interpretation for metabolomics applications

Metabolomics has been widely applied in human disease research and environmental science to study systematic changes in metabolites in response to diverse stimuli. NMR-based metabolomics is widely used, but peak overlap in one-dimensional (1D) NMR spectra can limit the accuracy of quantitative analysis. Two-dimensional (2D) NMR has been applied to resolve the 1D overlap problem, but processing the data remains challenging. In this study, we built an automated approach that uses machine learning to process 2D NMR data for quantitative applications. Partial least squares discriminant analysis (PLS-DA), artificial neural network classification (ANN-DA), gradient boosted trees classification (XGBoost-DA), and artificial deep learning neural network classification (ANNDL-DA) were applied in combination with an automatic peak selection approach. Standard mixtures, sea anemone extracts, and mouse fecal samples were tested to demonstrate the approach. Our results showed that ANN-DA and ANNDL-DA select 2D NMR peaks with high accuracy (around 90%), indicating strong potential for quantitative 2D NMR-based metabolomics studies, while PLS-DA and XGBoost-DA showed limitations related to either data variation or overfitting. Our study provides an automated approach for applying 2D NMR data to routine quantitative analysis in metabolomics.
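
The overlap problem described in the abstract can be illustrated with a small synthetic example. The sketch below is not the authors' peak selection method, which the abstract does not specify; it is a minimal illustration under stated assumptions: two Gaussian peaks share the same position on one axis (labelled F2 here) but differ on the other (F1), and candidate peaks are found by a simple local-maximum search. All function names, the grid size, the peak widths, and the threshold are hypothetical choices made for illustration.

```python
import math

def gaussian_2d(x, y, cx, cy, sx, sy, amp=1.0):
    """Value of a 2D Gaussian peak centred at (cx, cy)."""
    return amp * math.exp(-((x - cx) ** 2) / (2 * sx ** 2)
                          - ((y - cy) ** 2) / (2 * sy ** 2))

def synthetic_spectrum(peaks, n_f1=40, n_f2=40, width=2.0):
    """Build an n_f1 x n_f2 grid (rows = F1 index, columns = F2 index)."""
    return [[sum(gaussian_2d(j, i, cx, cy, width, width, amp)
                 for (cx, cy, amp) in peaks)
             for j in range(n_f2)]
            for i in range(n_f1)]

def local_maxima_1d(trace, threshold):
    """Indices that exceed the threshold and their immediate neighbours."""
    return [k for k in range(1, len(trace) - 1)
            if trace[k] > threshold
            and trace[k] > trace[k - 1]
            and trace[k] >= trace[k + 1]]

def local_maxima_2d(grid, threshold):
    """(row, col) points that exceed the threshold and all 8 neighbours."""
    found = []
    for i in range(1, len(grid) - 1):
        for j in range(1, len(grid[0]) - 1):
            value = grid[i][j]
            if value <= threshold:
                continue
            neighbours = [grid[i + di][j + dj]
                          for di in (-1, 0, 1) for dj in (-1, 0, 1)
                          if (di, dj) != (0, 0)]
            if all(value > n for n in neighbours):
                found.append((i, j))
    return found

# Hypothetical peaks as (F2 centre, F1 centre, amplitude):
# identical F2 position, different F1 position.
peaks = [(20, 10, 1.0), (20, 30, 0.8)]
grid = synthetic_spectrum(peaks)

# Summing over F1 mimics a 1D trace in which the two signals coincide.
projection_1d = [sum(grid[i][j] for i in range(len(grid)))
                 for j in range(len(grid[0]))]

peaks_1d = local_maxima_1d(projection_1d, threshold=0.1)
peaks_2d = local_maxima_2d(grid, threshold=0.1)

print("1D candidates:", peaks_1d)
print("2D candidates:", peaks_2d)
```

In this toy case the 1D trace yields a single candidate while the 2D grid yields two, which is the motivation for moving to 2D data. In a workflow like the one the abstract describes, candidates of this kind would then be passed to a classifier that accepts or rejects each one; that step is omitted here.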

Source journal
Analytical Biochemistry (Biology: Analytical Chemistry)
CiteScore: 5.70
Self-citation rate: 0.00%
Articles published: 283
Review time: 44 days
Journal description: The journal's title, Analytical Biochemistry: Methods in the Biological Sciences, declares its broad scope: methods for the basic biological sciences, including biochemistry, molecular genetics, cell biology, proteomics, immunology, bioinformatics, and wherever the frontiers of research take the field. The emphasis is on methods, from the strictly analytical to the more preparative, including novel approaches to protein purification as well as improvements in cell and organ culture. The techniques covered are equally inclusive, ranging from aptamers to zymology. The journal has been particularly active in: analytical techniques for biological molecules; aptamer selection and utilization; biosensors; chromatography; cloning, sequencing and mutagenesis; electrochemical methods; electrophoresis; enzyme characterization methods; immunological approaches; mass spectrometry of proteins and nucleic acids; metabolomics; nano-level techniques; and optical spectroscopy in all its forms. The journal is reluctant to include most drug and strictly clinical studies, as there are more suitable publication platforms for these types of papers.