Comparative analysis of continuous similarity measures for compound identification in mass spectrometry-based metabolomics

Impact factor 3.7; Zone 2 (Chemistry); JCR Q2 (Automation & Control Systems)
Hunter Dlugas, Xiang Zhang, Seongho Kim
Chemometrics and Intelligent Laboratory Systems, volume 263, Article 105417. Published 2025-05-03. Journal article; not open access.
DOI: 10.1016/j.chemolab.2025.105417
https://www.sciencedirect.com/science/article/pii/S0169743925001029
Citations: 0

Abstract

In mass spectrometry (MS)-based metabolomics, the most straightforward and efficient approach for compound identification is the comparison of similarity scores between experimental spectra and reference spectra. Among various single and composite similarity measures, the Cosine Correlation is favored due to its simplicity, efficiency, and effectiveness. Recently, the Shannon Entropy Correlation has shown superior performance over several other measures, including the Cosine Correlation, in LC-MS-based metabolomics, particularly concerning receiver operating characteristic (ROC) curves and false discovery rates. However, previous comparisons did not consider the weight factor transformation, which is critical for achieving higher accuracy with the Cosine Correlation. This study conducted a comparative analysis of the Cosine Correlation and Shannon Entropy Correlation, incorporating the weight factor transformation during preprocessing. Additionally, we developed a novel entropy correlation measure, the Tsallis Entropy Correlation, which offers greater versatility than the Shannon Entropy Correlation. Our accuracy-based results indicate that the weight factor transformation is essential for achieving higher identification performance in both LC-MS- and GC-MS-based compound identification. Although the Tsallis Entropy Correlation outperforms the Shannon Entropy Correlation in terms of accuracy, it comes with higher computational expense. In contrast, the Cosine Correlation, when combined with the weight factor transformation, achieves the highest accuracy and the lowest computational expense, demonstrating both robustness and efficiency in MS-based compound identification.
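
To make the measures named in the abstract concrete, the following Python sketch shows one plausible way to compute them. It is not the authors' implementation. It assumes both spectra are already aligned on shared m/z bins, uses the commonly cited weight factor form intensity^a * (m/z)^b with illustrative exponents, and uses the widely used Shannon form 1 - (2*S_AB - S_A - S_B)/ln 4. The abstract does not define the Tsallis Entropy Correlation, so the normalizer in tsallis_entropy_similarity is a hypothetical choice made here (the divergence two non-overlapping spectra would produce), and all function names are illustrative.

```python
import math

def weight_factor_transform(mz, intensity, a=0.5, b=0.0):
    """Weight each peak as intensity**a * mz**b (exponents are illustrative)."""
    return [(i ** a) * (m ** b) for m, i in zip(mz, intensity)]

def cosine_similarity(x, y):
    """Cosine of the angle between two aligned intensity vectors."""
    dot = sum(u * v for u, v in zip(x, y))
    norm = math.sqrt(sum(u * u for u in x)) * math.sqrt(sum(v * v for v in y))
    return dot / norm if norm > 0 else 0.0

def _normalize(x):
    """Scale intensities so they sum to 1 (assumes a positive total)."""
    total = sum(x)
    return [v / total for v in x]

def shannon_entropy(p):
    return -sum(v * math.log(v) for v in p if v > 0)

def tsallis_entropy(p, q):
    """Tsallis entropy; reduces to Shannon entropy as q approaches 1."""
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy(p)
    return (1.0 - sum(v ** q for v in p if v > 0)) / (q - 1.0)

def shannon_entropy_similarity(x, y):
    """1 - (2*S_AB - S_A - S_B) / ln 4, with S_AB the entropy of the 1:1 mix."""
    pa, pb = _normalize(x), _normalize(y)
    merged = [(u + v) / 2 for u, v in zip(pa, pb)]
    div = 2 * shannon_entropy(merged) - shannon_entropy(pa) - shannon_entropy(pb)
    return 1.0 - div / math.log(4)

def tsallis_entropy_similarity(x, y, q=2.0):
    """Tsallis analogue of the Shannon similarity.

    ASSUMPTION: the normalizer is the divergence that two non-overlapping
    spectra would give; the paper's own definition may differ.
    """
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy_similarity(x, y)
    pa, pb = _normalize(x), _normalize(y)
    merged = [(u + v) / 2 for u, v in zip(pa, pb)]
    div = 2 * tsallis_entropy(merged, q) - tsallis_entropy(pa, q) - tsallis_entropy(pb, q)
    power_sum = sum(v ** q for v in pa if v > 0) + sum(v ** q for v in pb if v > 0)
    max_div = power_sum * (1.0 - 2.0 ** (1.0 - q)) / (q - 1.0)
    return 1.0 - div / max_div

if __name__ == "__main__":
    mz = [50.0, 75.0, 100.0]
    query = weight_factor_transform(mz, [10.0, 40.0, 50.0])
    reference = weight_factor_transform(mz, [12.0, 35.0, 53.0])
    print(cosine_similarity(query, reference))
    print(shannon_entropy_similarity(query, reference))
    print(tsallis_entropy_similarity(query, reference, q=2.0))
```

In this sketch the weight factor transformation is applied to both spectra before any similarity is computed, which mirrors the preprocessing order described in the abstract. The extra parameter q is what gives the Tsallis form its added flexibility, at the cost of evaluating a power for every peak.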
Source journal: Chemometrics and Intelligent Laboratory Systems
CiteScore: 7.50
Self-citation rate: 7.70%
Articles published: 169
Review time: 3.4 months
About the journal: Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines. Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data. The journal deals with the following topics:
1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, System Biology, -Omics, etc.).
2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration, etc.). Routine applications of established chemometrical techniques will not be considered.
3) Development of new software that provides novel tools or truly advances the use of chemometrical methods.
4) Well characterized data sets to test performance for the new methods and software.
The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for Manuscripts.