Title: The development of machine learning approaches in two-dimensional NMR data interpretation for metabolomics applications
Authors: Julie Pollak, Moses Mayonu, Lin Jiang, Bo Wang
Journal: Analytical Biochemistry, volume 695, Article 115654 (impact factor 2.6; JCR Q2, Biochemical Research Methods)
Published: 2024-08-24
DOI: 10.1016/j.ab.2024.115654
URL: https://www.sciencedirect.com/science/article/pii/S0003269724001982
Citations: 0
Abstract
Metabolomics has been widely applied in human disease research and environmental science to study systematic changes in metabolites in response to diverse types of stimuli. NMR-based metabolomics is widely used, but peak overlap in one-dimensional (1D) NMR spectra can limit the accuracy of quantitative analysis. Two-dimensional (2D) NMR has been applied to resolve the 1D overlap problem, but processing the data remains challenging. In this study, we built an automatic approach to process 2D NMR data for quantitative applications using machine learning. Partial least squares discriminant analysis (PLS-DA), artificial neural network classification (ANN-DA), gradient boosted trees classification (XGBoost-DA), and artificial deep learning neural network classification (ANNDL-DA) were applied in combination with an automatic peak selection approach. Standard mixtures, sea anemone extracts, and mouse fecal samples were tested to demonstrate the approach. Our results showed that ANN-DA and ANNDL-DA select 2D NMR peaks with high accuracy (around 90%), indicating strong potential for quantitative 2D NMR-based metabolomics studies, while PLS-DA and XGBoost-DA showed limitations in either data variation or overfitting. Our study provides an automatic approach for applying 2D NMR data to routine quantitative analysis in metabolomics.
About the journal:
The journal's title, Analytical Biochemistry: Methods in the Biological Sciences, declares its broad scope: methods for the basic biological sciences that include biochemistry, molecular genetics, cell biology, proteomics, immunology, bioinformatics and wherever the frontiers of research take the field.
The emphasis is on methods from the strictly analytical to the more preparative that would include novel approaches to protein purification as well as improvements in cell and organ culture. The actual techniques are equally inclusive ranging from aptamers to zymology.
The journal has been particularly active in:
- Analytical techniques for biological molecules
- Aptamer selection and utilization
- Biosensors
- Chromatography
- Cloning, sequencing and mutagenesis
- Electrochemical methods
- Electrophoresis
- Enzyme characterization methods
- Immunological approaches
- Mass spectrometry of proteins and nucleic acids
- Metabolomics
- Nano level techniques
- Optical spectroscopy in all its forms
The journal is reluctant to include most drug and strictly clinical studies as there are more suitable publication platforms for these types of papers.