TransExION: a transformer based explainable similarity metric for comparing IONS in tandem mass spectrometry

IF 7.1 2区 化学 Q1 CHEMISTRY, MULTIDISCIPLINARY
Danh Bui-Thi, Youzhong Liu, Jennifer L. Lippens, Kris Laukens, Thomas De Vijlder
{"title":"TransExION: a transformer based explainable similarity metric for comparing IONS in tandem mass spectrometry","authors":"Danh Bui-Thi,&nbsp;Youzhong Liu,&nbsp;Jennifer L. Lippens,&nbsp;Kris Laukens,&nbsp;Thomas De Vijlder","doi":"10.1186/s13321-024-00858-5","DOIUrl":null,"url":null,"abstract":"<p>Small molecule identification is a crucial task in analytical chemistry and life sciences. One of the most commonly used technologies to elucidate small molecule structures is mass spectrometry. Spectral library search of product ion spectra (MS/MS) is a popular strategy to identify or find structural analogues. This approach relies on the assumption that spectral similarity and structural similarity are correlated. However, popular spectral similarity measures, usually calculated based on identical fragment matches between the MS/MS spectra, do not always accurately reflect the structural similarity. In this study, we propose TransExION, a Transformer based Explainable similarity metric for IONS. TransExION detects related fragments between MS/MS spectra through their mass difference and uses these to estimate spectral similarity. These related fragments can be nearly identical, but can also share a substructure. TransExION also provides a post-hoc explanation of its estimation, which can be used to support scientists in evaluating the spectral library search results and thus in structure elucidation of unknown molecules. Our model has a Transformer based architecture and it is trained on the data derived from GNPS MS/MS libraries. The experimental results show that it improves existing spectral similarity measures in searching and interpreting structural analogues as well as in molecular networking.</p>","PeriodicalId":617,"journal":{"name":"Journal of Cheminformatics","volume":"16 1","pages":""},"PeriodicalIF":7.1000,"publicationDate":"2024-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00858-5","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Cheminformatics","FirstCategoryId":"92","ListUrlMain":"https://link.springer.com/article/10.1186/s13321-024-00858-5","RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0

Abstract

Small molecule identification is a crucial task in analytical chemistry and life sciences. One of the most commonly used technologies to elucidate small molecule structures is mass spectrometry. Spectral library search of product ion spectra (MS/MS) is a popular strategy to identify or find structural analogues. This approach relies on the assumption that spectral similarity and structural similarity are correlated. However, popular spectral similarity measures, usually calculated based on identical fragment matches between the MS/MS spectra, do not always accurately reflect the structural similarity. In this study, we propose TransExION, a Transformer based Explainable similarity metric for IONS. TransExION detects related fragments between MS/MS spectra through their mass difference and uses these to estimate spectral similarity. These related fragments can be nearly identical, but can also share a substructure. TransExION also provides a post-hoc explanation of its estimation, which can be used to support scientists in evaluating the spectral library search results and thus in structure elucidation of unknown molecules. Our model has a Transformer based architecture and it is trained on the data derived from GNPS MS/MS libraries. The experimental results show that it improves existing spectral similarity measures in searching and interpreting structural analogues as well as in molecular networking.

TransExION:基于转换器的可解释相似性指标,用于比较串联质谱中的离子。
小分子鉴定是分析化学和生命科学领域的一项重要任务。质谱技术是阐明小分子结构最常用的技术之一。产物离子谱(MS/MS)的谱库搜索是识别或寻找结构类似物的常用策略。这种方法依赖于光谱相似性和结构相似性相互关联的假设。然而,流行的光谱相似性度量通常是根据 MS/MS 图谱之间的相同片段匹配度来计算的,并不总能准确反映结构相似性。在本研究中,我们提出了一种基于变换器的 IONS 可解释相似性度量方法--TransExION。TransExION 通过质量差来检测 MS/MS 图谱之间的相关片段,并利用这些片段来估计光谱相似性。这些相关片段可能几乎相同,但也可能共享一个子结构。TransExION 还会对其估算结果进行事后解释,以帮助科学家评估谱库搜索结果,从而对未知分子进行结构阐释。我们的模型采用基于 Transformer 的架构,并在 GNPS MS/MS 图谱库数据的基础上进行了训练。实验结果表明,该模型在搜索和解释结构类似物以及分子网络方面改进了现有的光谱相似性测量方法。科学贡献:我们提出了一种基于变换器的光谱相似性度量方法,它能改进小分子串联质谱的比较。我们提供了一种事后解释,可作为基于数据库光谱的未知光谱注释的良好起点。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Journal of Cheminformatics
Journal of Cheminformatics CHEMISTRY, MULTIDISCIPLINARY-COMPUTER SCIENCE, INFORMATION SYSTEMS
CiteScore
14.10
自引率
7.00%
发文量
82
审稿时长
3 months
期刊介绍: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases, and molecular modelling, chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases, computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信