A Large Language Model–Powered Map of Metabolomics Research
Olatomiwa O. Bifarin*, Varun S. Yelluru, Aditya Simhadri, and Facundo M. Fernández*
Analytical Chemistry, 2025, 97 (27), 14088–14096. Published 2025-07-03. DOI: 10.1021/acs.analchem.5c01672
Journal Article. Impact factor: 6.7 (JCR Q1, Chemistry, Analytical). Full text: https://pubs.acs.org/doi/10.1021/acs.analchem.5c01672
Citations: 0
Abstract
We present a comprehensive map of the metabolomics research landscape, synthesizing insights from over 80,000 publications. Using PubMedBERT, we transformed abstracts into 768-dimensional embeddings that capture the nuanced thematic structure of the field. Dimensionality reduction with t-SNE revealed distinct clusters corresponding to key domains, such as analytical chemistry, plant biology, pharmacology, and clinical diagnostics. In addition, a neural topic modeling pipeline refined with GPT-4o mini reclassified the corpus into 20 distinct topics, ranging from “Plant Stress Response Mechanisms” and “NMR Spectroscopy Innovations” to “COVID-19 Metabolomic and Immune Responses.” Temporal analyses further highlight trends including the rise of deep learning methods post-2015 and a continued focus on biomarker discovery. Integration of metadata such as publication statistics and sample sizes provides additional context to these evolving research dynamics. An interactive web application (https://metascape.streamlit.app/) enables the dynamic exploration of these insights. Overall, this study offers a robust framework for literature synthesis that empowers researchers, clinicians, and policymakers to identify emerging research trajectories and address critical challenges in metabolomics while also sharing our perspectives on key trends shaping the field.
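The temporal analyses mentioned above amount, at their simplest, to tracking how each topic's share of publications changes from year to year. The following is a minimal sketch of that idea only, not the authors' pipeline: the function name `topic_share_by_year`, the `(year, topic)` record layout, and the toy records are all assumptions made for illustration, and the toy values are invented rather than taken from the paper.

```python
from collections import Counter, defaultdict


def topic_share_by_year(records):
    """Return {year: {topic: fraction of that year's papers}}.

    `records` is an iterable of (year, topic) pairs, one per publication.
    This layout is a hypothetical stand-in for whatever topic assignments
    a topic-modeling pipeline would produce.
    """
    counts = defaultdict(Counter)
    for year, topic in records:
        counts[year][topic] += 1

    shares = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        shares[year] = {topic: n / total for topic, n in counts[year].items()}
    return shares


# Toy, invented records (NOT data from the paper); topic labels are placeholders.
toy_records = [
    (2014, "biomarker discovery"),
    (2014, "biomarker discovery"),
    (2014, "deep learning"),
    (2014, "plant stress"),
    (2020, "biomarker discovery"),
    (2020, "deep learning"),
    (2020, "deep learning"),
    (2020, "plant stress"),
]

shares = topic_share_by_year(toy_records)
for year, by_topic in shares.items():
    print(year, by_topic)
```

Normalizing by each year's total, rather than comparing raw counts, keeps overall growth in the literature from being mistaken for growth in a particular topic.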
About the journal:
Analytical Chemistry, a peer-reviewed research journal, focuses on disseminating new and original knowledge across all branches of analytical chemistry. Fundamental articles may explore general principles of chemical measurement science and need not directly address existing or potential analytical methodology. They can be entirely theoretical or report experimental results. Contributions may cover various phases of analytical operations, including sampling, bioanalysis, electrochemistry, mass spectrometry, microscale and nanoscale systems, environmental analysis, separations, spectroscopy, chemical reactions and selectivity, instrumentation, imaging, surface analysis, and data processing. Papers discussing known analytical methods should present a significant, original application of the method, a notable improvement, or results on an important analyte.