Raven L. Buckman Johnson, Vy T. Tat and Young Jin Lee
Unsupervised machine learning for mass spectrometry imaging data analysis with in vivo isotope labeling
Analyst, 19, 4404-4413. Published 2025-08-29. DOI: 10.1039/D5AN00649J
https://pubs.rsc.org/en/content/articlelanding/2025/an/d5an00649j
Abstract
Mass spectrometry imaging (MSI) has emerged as a powerful tool for spatial metabolomics, but untargeted data analysis has proven challenging. When combined with in vivo isotope labeling (MSIi), MSI provides insights into metabolic dynamics with high spatial resolution; however, the data analysis becomes even more complex. Although various tools exist for advanced MSI analyses, machine learning (ML) applications to MSIi have not been explored. In this study, we leverage Cardinal to process MSIi datasets of duckweeds labeled with either ¹³CO₂ or D₂O. We apply spatial shrunken centroid (SSC) segmentation, an unsupervised ML algorithm, to differentiate metabolite localizations and investigate isotope labeling of untargeted metabolites. In the SSC segmentation of the three-day ¹³C-labeled duckweed dataset, five spatial segments were identified based on distinct lipid isotopologue distributions, in contrast to the classification of only three tissue regions in a previous manual analysis based on galactolipid isotopologues. Similarly, SSC segmentation of the five-day D-labeled dataset revealed five spatial segments based on distinct metabolite and isotopologue profiles. Further, this untargeted segmentation analysis of the MSIi datasets provided insights into the tissue-specific relative flux of each metabolite by calculating the fraction of de novo biosynthesis in each segment. Overall, the application of unsupervised machine learning to MSIi datasets significantly reduced analysis time, increased throughput, and improved the clarity of spatial isotopologue distributions.
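
The per-segment calculation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Cardinal (R) workflow, and the abstract does not give their formula. The sketch assumes the labeled fraction of a metabolite in a segment can be approximated as one minus the share of the unlabeled (M0) isotopologue in the summed isotopologue intensity, and it ignores natural isotope abundance correction. The function name `labeled_fraction_by_segment` and the input layout are hypothetical.

```python
from collections import defaultdict


def labeled_fraction_by_segment(pixels):
    """Approximate the labeled fraction of one metabolite in each segment.

    pixels: iterable of (segment_id, intensities) pairs, one per pixel.
    intensities[0] is the unlabeled (M0) isotopologue intensity and
    intensities[1:] are the labeled isotopologues (M+1, M+2, ...).

    Assumption (not from the source): every isotopologue heavier than M0
    counts as newly synthesized, with no natural-abundance correction.
    """
    unlabeled = defaultdict(float)
    total = defaultdict(float)
    for segment, intensities in pixels:
        unlabeled[segment] += intensities[0]
        total[segment] += sum(intensities)
    # Skip segments with no signal to avoid dividing by zero.
    return {
        segment: 1.0 - unlabeled[segment] / total[segment]
        for segment in total
        if total[segment] > 0
    }


# Illustrative made-up intensities: two pixels in segment 1, one in segment 2.
example_pixels = [
    (1, [80.0, 15.0, 5.0]),
    (1, [20.0, 50.0, 30.0]),
    (2, [90.0, 10.0, 0.0]),
]
fractions = labeled_fraction_by_segment(example_pixels)
```

Summing intensities over a segment before taking the ratio weights each pixel by its signal, so low-intensity pixels contribute less noise than they would if per-pixel fractions were averaged.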