The Dark Metabolome/Lipidome and In-Source Fragmentation

Impact factor: 3.0; JCR quartile: Q2 (Chemistry, Analytical)
Winnie Uritboonthai, Linh Hoang, Aries Aisporna, Martin Giera, Gary Siuzdak
Analytical Science Advances, 6 (1). Published 2025-05-14. DOI: 10.1002/ansa.70012. Full text: https://onlinelibrary.wiley.com/doi/10.1002/ansa.70012

Abstract

To the editor,

Tandem mass spectrometry (MS/MS) is valued for its ability to facilitate molecular identification and to deliver highly consistent data across a wide range of mass spectrometry platforms. Distinct from MS/MS is the fragmentation that occurs during electrospray ionization (ESI), commonly referred to as in-source fragmentation (ISF) (Figure 1). ISF was first observed in the 1950s with electron ionization and has since been recognized as an inherent yet often overlooked feature of the ESI process, albeit less prevalent than with electron ionization. Recently, ISF has been associated with the overrepresentation of peaks in liquid chromatography mass spectrometry (LC/MS) data, where it accounts for the majority of observed unfiltered peaks [1]. Because of this overrepresentation, and the resulting inability to identify the molecules associated with these peaks using MS/MS data, ISF has been linked to the so-called “dark metabolome” [2, 3] (which also encompasses the lipidome), a term used to describe uncharacterized molecular species in metabolomics and lipidomics. This association [1] was determined by examining MS/MS data acquired at 0 eV collision energy from METLIN's extensive library of over 931,000 molecular standards. However, while the similarity between ISF and MS/MS data at 0 eV has been described in previous studies [1, 4–6], it has yet to be directly established that the two correlate. We therefore explored the consistency between MS/MS (0 eV) data and ISF across various molecular species to assess whether mining METLIN's MS/MS (0 eV) data can effectively link ISF to the dark metabolome and lipidome.

Liquid chromatography-tandem mass spectrometry (LC-MS/MS) with ESI has become a cornerstone of metabolomics, lipidomics, and clinical analysis because of its accuracy in identifying small molecules within complex biological matrices. With LC-MS/MS, after ionization occurs in the ESI source, charged molecules are directed into a collision cell where they undergo fragmentation for structural analysis. This procedure is typically repeated for all charged analytes present in a sample. Despite its utility, this method has revealed an unexpectedly vast array of spectral features associated with the “dark metabolome.” Given the limited number of protein-coding genes [7, 8], only a fraction of which produce enzymes, the chemical diversity [3, 9, 10] detected through LC-MS/MS (potentially hundreds of thousands or even millions of metabolites) far exceeds biological expectations. Current estimates suggest that less than 2% of observed LC-MS/MS spectra can be annotated, leaving a potentially broad spectrum of unknown compounds [3]. Recent research [1] using the METLIN database and its data at 0 eV has shed light on this discrepancy, suggesting that much of the perceived complexity may stem from technological factors, particularly ISF, rather than from biological diversity itself.

Our laboratory, along with several others [11], has observed the widespread occurrence of ISF [12, 13]. This process involves the fragmentation of analytes during the initial ionization stage within the ESI source, before they reach the collision cell. Essentially, ISF can transform a single analyte into multiple molecular ions and fragments, creating a complex array of ions from what was initially a single entity. The mass analyzer then indiscriminately isolates and further fragments whatever enters the collision cell. Given this understanding, we suspect that ISF may contribute substantially to the so-called dark metabolome.

To correlate the observed peaks with ISF, we examined the METLIN MS/MS database [14], which consists of over 931,000 molecular standards representing over 350 chemical classes. We mined METLIN's MS/MS data at 0 eV, an energy designed to simulate the absence of collision-induced dissociation (CID), to assess whether these spectra could reflect ISF-related fragments. The analysis revealed that ISF could account for over 70% of the peaks observed in typical LC-MS/MS metabolomic datasets when using a 5% cutoff threshold. This number rises when the threshold is reduced to less than 3%. The 5% and 3% thresholds represent a conservative range of peak intensities across LC/MS experiments, where typical intensity counts range from 10,000 to millions, spanning well over two orders of magnitude.
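As a rough illustration of how lowering an intensity cutoff raises the number of fragment peaks counted, the following Python sketch counts the non-precursor peaks in a single 0 eV spectrum. It is not the procedure used in [1]: the `count_fragment_peaks` helper, the 0.01 Da precursor tolerance, the choice to express the cutoff relative to the base peak, and the peak values themselves are all assumptions made for illustration.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity in counts)


def count_fragment_peaks(
    peaks: List[Peak],
    precursor_mz: float,
    cutoff: float = 0.05,
    mz_tol: float = 0.01,
) -> int:
    """Count non-precursor peaks at or above cutoff x base-peak intensity.

    Assumption: the cutoff is a fraction of the most intense peak.
    """
    if not peaks:
        return 0
    base = max(intensity for _, intensity in peaks)
    return sum(
        1
        for mz, intensity in peaks
        if abs(mz - precursor_mz) > mz_tol and intensity >= cutoff * base
    )


# Hypothetical 0 eV spectrum; the values are invented, not taken from METLIN.
spectrum = [
    (300.10, 100_000.0),  # precursor ion
    (282.09, 6_000.0),
    (254.08, 3_500.0),
    (136.06, 1_000.0),
]

print(count_fragment_peaks(spectrum, 300.10, cutoff=0.05))  # 1
print(count_fragment_peaks(spectrum, 300.10, cutoff=0.03))  # 2
```

In this toy spectrum, moving the cutoff from 5% to 3% admits one additional fragment peak, which mirrors the direction of the trend described above without reproducing its magnitude.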

While the METLIN study provides a large statistical snapshot of the number of ISF peaks in a typical LC/MS experiment, it lacked example data directly comparing ISF and MS/MS (0 eV) data. Here, we examined both types of data (METLIN MS/MS 0 eV and ISF) from 10 molecules (Figures 2 and 3). ISF data were acquired using both an Agilent QTOF mass spectrometer (collision cell off) and an Agilent TOF mass spectrometer. The data revealed a high level of consistency between the fragment ions in METLIN MS/MS 0 eV data and those produced by ISF, although intensities were generally higher for the ISF-generated fragment ion peaks. These examples suggest that (1) the original comparison between ISF and MS/MS (0 eV) is valid, and (2) the ISF process is slightly more energetic than MS/MS (0 eV), as indicated by the higher intensities observed for ISF fragments, at least on these two instrument platforms (Agilent QTOF and Agilent TOF).
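One way such a comparison could be made concrete is sketched below in Python: fragment peaks from a 0 eV MS/MS spectrum are paired with ISF peaks by m/z, and their relative intensities are compared. The letter does not describe a specific matching procedure or metric, so the helpers `to_relative` and `match_peaks`, the 0.01 Da tolerance, and the two spectra are hypothetical and serve only as an illustration.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def to_relative(peaks: List[Peak]) -> List[Peak]:
    """Rescale intensities so the most intense peak equals 100."""
    base = max(intensity for _, intensity in peaks)
    return [(mz, 100.0 * intensity / base) for mz, intensity in peaks]


def match_peaks(
    ref: List[Peak], obs: List[Peak], mz_tol: float = 0.01
) -> List[Tuple[Peak, Peak]]:
    """Pair each reference peak with the nearest observed peak within mz_tol."""
    pairs = []
    for ref_peak in ref:
        close = [p for p in obs if abs(p[0] - ref_peak[0]) <= mz_tol]
        if close:
            nearest = min(close, key=lambda p: abs(p[0] - ref_peak[0]))
            pairs.append((ref_peak, nearest))
    return pairs


# Hypothetical spectra for one compound; the values are invented.
msms_0ev = to_relative([(300.10, 90_000.0), (282.09, 4_500.0), (254.08, 1_800.0)])
isf = to_relative([(300.10, 80_000.0), (282.09, 12_000.0), (254.08, 4_000.0)])

pairs = match_peaks(msms_0ev, isf)

# Fraction of 0 eV peaks that also appear in the ISF spectrum.
shared_fraction = len(pairs) / len(msms_0ev)

# m/z values whose relative intensity is higher under ISF than at 0 eV.
higher_in_isf = [ref[0] for ref, obs in pairs if obs[1] > ref[1]]

print(shared_fraction)  # 1.0
print(higher_in_isf)    # [282.09, 254.08]
```

Scaling both spectra to their base peak keeps the comparison independent of absolute counts, which differ between instruments. In this invented example, every 0 eV peak is also present under ISF and both fragments are relatively more intense under ISF, the same qualitative pattern reported for the 10 molecules.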

Overall, these comparative examples between METLIN MS/MS (0 eV) data and ISF provide another level of evidence that the peaks observed in LC/MS experiments are predominantly associated with ISF. Figure 4 illustrates the reasoning behind this conclusion: MS/MS data are generated on all of the unfiltered observable LC/MS peaks, and given the prevalence of ISF, most of those MS/MS data represent fragment ions rather than intact molecules. This would explain why so many peaks are not identifiable in current tandem mass spectrometry databases.

Winnie Uritboonthai: Data curation, formal analysis. Linh Hoang: Data acquisition. Aries Aisporna: Software. Martin Giera: Writing–review & editing. Gary Siuzdak: Experimental design, formal analysis, writing.

The authors declare no conflicts of interest.
