Mind your Ps and Qs – Caveats in metabolomics data analysis
Yun Xu, Royston Goodacre
Trends in Analytical Chemistry, volume 183, Article 118064. Published 2024-11-23.
DOI: 10.1016/j.trac.2024.118064
https://www.sciencedirect.com/science/article/pii/S0165993624005478
Citations: 0
Abstract
Metabolomics studies use high-throughput analytical platforms to measure metabolites in biological samples. These mass spectrometry and/or NMR spectroscopy platforms generate complex data sets, and analysing such data poses many challenges. In particular, the high dimensionality of the data combined with a relatively small number of samples means that sophisticated statistical models are required, and these models come with caveats. In this review, we discuss some of the common caveats associated with the most popular statistical tests and models. We present common mistakes found in metabolomics data analysis, along with recommendations on how to avoid them. The aim of this review is to raise awareness of the potential risks of misusing or abusing statistical models, and to promote good practices for reliable and reproducible metabolomics research. We also propose a new form of permutation test that emphasises assessing the statistical significance level of the effect captured by a supervised model.
About the journal:
TrAC publishes succinct and critical overviews of recent advancements in analytical chemistry, designed to assist analytical chemists and other users of analytical techniques. These reviews offer excellent, up-to-date, and timely coverage of various topics within analytical chemistry. Encompassing areas such as analytical instrumentation, biomedical analysis, biomolecular analysis, biosensors, chemical analysis, chemometrics, clinical chemistry, drug discovery, environmental analysis and monitoring, food analysis, forensic science, laboratory automation, materials science, metabolomics, pesticide-residue analysis, pharmaceutical analysis, proteomics, surface science, and water analysis and monitoring, these critical reviews provide comprehensive insights for practitioners in the field.