Yingjiao Shi, Ji Yang, Qianxu Yang, Yipeng Zhang, Zhongda Zeng
Analytical and Bioanalytical Chemistry, pp. 3061-3077. Journal Article. Published 2025-06-01 (Epub 2025-04-18). DOI: https://doi.org/10.1007/s00216-025-05847-7
Quality evaluation of metabolite annotation based on comprehensive simulation of MS/MS data from high-resolution mass spectrometry (HRMS) and similarity scoring.
Metabolite annotation is a critical step in discovery metabolomics, but it remains a significant challenge. In this study, the accuracy of metabolite annotation was systematically evaluated by simulating tandem mass spectrometry (MS/MS) data from high-resolution mass spectrometry (HRMS) and then constructing a large-scale virtual database. Various similarity scoring methods were also compared to assess their annotation performance. First, three key characteristics essential for simulating MS/MS spectra that closely resemble experimental data were identified: (i) the number of mass-to-charge ratio (m/z) features, (ii) the differences between neighboring m/z values, and (iii) the intensity distribution of MS/MS features. These factors were used to generate representative MS/MS spectra for the subsequent study. A virtual MS/MS database covering over 100,000 metabolites with diverse structural similarities and differences was constructed to support accurate annotation assessment. To evaluate annotation quality, two simulation strategies, based on strong and weak data inference respectively, were proposed to replicate MS/MS spectra of unknown metabolites. These simulated spectra were then compared against the virtual database, which provided insight into the variations expected in experimental MS/MS data. Furthermore, eight similarity evaluation methods, including entropy similarity (ES) and weighted dot product (W/DP) algorithms, were evaluated for their effectiveness in metabolite annotation. The results revealed that some methods, such as ES, exhibited strong resistance to interference and broad adaptability across different MS/MS patterns, whereas others yielded reliable outcomes only under specific conditions. This study provides a systematic framework for quality evaluation in metabolite annotation and offers strategies to mitigate false-positive identifications. The findings are relevant to advancing metabolomics research and to further improving annotation reliability in complex biological samples.
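To make the two named scoring methods concrete, the following is a minimal Python sketch of an unweighted spectral entropy similarity and a weighted dot product (cosine) score. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the peak-matching rule, the m/z tolerance, or the weighting exponents, so the greedy alignment, the `tol` value, and the `mz_power` / `int_power` defaults below are all hypothetical choices. Spectra are assumed to be non-empty lists of `(m/z, intensity)` pairs with positive total intensity.

```python
import math


def align(a, b, tol=0.01):
    """Pair peaks of two spectra whose m/z differ by at most tol (assumed, in Da).

    Returns (intensity_a, intensity_b, mean_mz) triples; unmatched peaks get 0.0
    on the other side. Greedy two-pointer matching is an assumption of this sketch.
    """
    a, b = sorted(a), sorted(b)
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        (mz_a, int_a), (mz_b, int_b) = a[i], b[j]
        if abs(mz_a - mz_b) <= tol:
            out.append((int_a, int_b, (mz_a + mz_b) / 2))
            i, j = i + 1, j + 1
        elif mz_a < mz_b:
            out.append((int_a, 0.0, mz_a))
            i += 1
        else:
            out.append((0.0, int_b, mz_b))
            j += 1
    out.extend((inten, 0.0, mz) for mz, inten in a[i:])
    out.extend((0.0, inten, mz) for mz, inten in b[j:])
    return out


def shannon_entropy(probs):
    """Shannon entropy (natural log) of a normalized intensity vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_similarity(a, b, tol=0.01):
    """Unweighted entropy similarity: 1 - (2*S_mix - S_a - S_b) / ln(4)."""
    pairs = align(a, b, tol)
    total_a = sum(p[0] for p in pairs)
    total_b = sum(p[1] for p in pairs)
    pa = [p[0] / total_a for p in pairs]
    pb = [p[1] / total_b for p in pairs]
    mix = [(x + y) / 2 for x, y in zip(pa, pb)]  # 1:1 merged spectrum
    gap = 2 * shannon_entropy(mix) - shannon_entropy(pa) - shannon_entropy(pb)
    return 1 - gap / math.log(4)


def weighted_dot_product(a, b, tol=0.01, mz_power=2.0, int_power=0.5):
    """Cosine score on peak weights w = mz**mz_power * intensity**int_power.

    The exponent defaults are illustrative assumptions, not values from the paper.
    """
    pairs = align(a, b, tol)
    wa = [mz ** mz_power * ia ** int_power for ia, _, mz in pairs]
    wb = [mz ** mz_power * ib ** int_power for _, ib, mz in pairs]
    norm = math.sqrt(sum(x * x for x in wa) * sum(y * y for y in wb))
    return sum(x * y for x, y in zip(wa, wb)) / norm if norm else 0.0


# Hypothetical query and reference spectra, for illustration only.
query = [(85.03, 20.0), (127.04, 100.0), (145.05, 45.0)]
reference = [(85.03, 25.0), (127.04, 100.0), (163.06, 30.0)]
print(entropy_similarity(query, reference))
print(weighted_dot_product(query, reference))
```

Both scores lie between 0 and 1, reaching 1 for identical spectra and 0 for spectra with no matching peaks. With a positive `mz_power`, the weighted dot product gives high-m/z fragments more influence, whereas the entropy score depends only on how intensity is distributed across matched and unmatched peaks.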
About the journal:
Analytical and Bioanalytical Chemistry’s mission is the rapid publication of excellent and high-impact research articles on fundamental and applied topics of analytical and bioanalytical measurement science. Its scope is broad, and ranges from novel measurement platforms and their characterization to multidisciplinary approaches that effectively address important scientific problems. The Editors encourage submissions presenting innovative analytical research in concept, instrumentation, methods, and/or applications, including: mass spectrometry, spectroscopy, and electroanalysis; advanced separations; analytical strategies in “-omics” and imaging, bioanalysis, and sampling; miniaturized devices, medical diagnostics, sensors; analytical characterization of nano- and biomaterials; chemometrics and advanced data analysis.