Javier E. Flores, David J. Degnan, Yuri E. Corilo, Chaevien S. Clendinen, and Lisa M. Bramer*
Journal of the American Society for Mass Spectrometry, Volume 36, Issue 10, pp. 2164–2170. Published 2025-09-05. DOI: 10.1021/jasms.5c00176. https://pubs.acs.org/doi/10.1021/jasms.5c00176
The Power of Many: An Ensemble Approach to Spectral Similarity
Quantifying the similarity between two mass spectra, a known reference mass spectrum and an unidentified sample mass spectrum, is at the heart of compound identification workflows in gas chromatography–mass spectrometry (GC-MS). The reference spectrum most like the sample is assigned as its identification (provided some quantitative similarity threshold is met, e.g., 80%), so accurately measuring similarity is essential. Significant research has gone toward developing metrics for this purpose, each of which has attempted to improve upon existing methods by incorporating GC-MS-specific information (e.g., peak ratios or retention times) or adopting various statistical and algorithmic frameworks. While this active development has led to a plethora of similarity metrics with demonstrated value across different contexts, the unfortunate consequence has been confusion over which metric should be used as a global standard. No single metric is currently accepted as the standard method because different metrics have demonstrated optimal performance in different contexts. In this work, we propose an ensemble approach to spectral similarity scoring that combines information from across existing similarity metrics to form an improved, globally representative similarity metric as a step toward establishing a global standard method. The resulting ensemble metrics are evaluated on over 88,000 spectra of varying complexity and demonstrate an improved ability to rank the correct reference spectrum as the top-matching candidate for a sample, relative to the rankings generated by individual similarity scores.
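The general idea of scoring a sample against reference spectra with several metrics and combining the results can be sketched as follows. This is an illustration only, not the authors' method: the abstract does not say which metrics are combined or how, so the two metrics here (cosine similarity and a normalized-intensity overlap), the unweighted mean used to combine them, and all function names and spectra are assumptions made for the example. Spectra are represented as dictionaries mapping m/z to intensity.

```python
import math


def _aligned(sample, reference):
    """Return the two intensity vectors over the union of m/z values."""
    mzs = sorted(set(sample) | set(reference))
    return ([sample.get(mz, 0.0) for mz in mzs],
            [reference.get(mz, 0.0) for mz in mzs])


def cosine_similarity(sample, reference):
    """Cosine of the angle between the two intensity vectors."""
    a, b = _aligned(sample, reference)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def overlap_similarity(sample, reference):
    """One minus half the L1 distance between sum-normalized spectra."""
    a, b = _aligned(sample, reference)
    total_a, total_b = sum(a), sum(b)
    if not total_a or not total_b:
        return 0.0
    return 1.0 - 0.5 * sum(abs(x / total_a - y / total_b) for x, y in zip(a, b))


# Hypothetical choice of metrics; the paper's actual set is not given here.
METRICS = (cosine_similarity, overlap_similarity)


def ensemble_score(sample, reference, metrics=METRICS):
    """Unweighted mean of the individual scores (an assumed combination rule)."""
    return sum(metric(sample, reference) for metric in metrics) / len(metrics)


def identify(sample, library, threshold=0.8):
    """Return (name, score) for the top-ranked reference.

    The name is None when the best score falls below the threshold.
    """
    name, score = max(
        ((ref_name, ensemble_score(sample, ref)) for ref_name, ref in library.items()),
        key=lambda pair: pair[1],
    )
    return (name, score) if score >= threshold else (None, score)


# Made-up spectra for illustration only.
library = {
    "compound_A": {41: 30.0, 43: 100.0, 57: 60.0},
    "compound_B": {51: 20.0, 77: 100.0, 105: 80.0},
}
sample = {41: 28.0, 43: 100.0, 57: 65.0, 71: 5.0}

print(identify(sample, library))
```

With these made-up spectra the sample shares most of its peaks with `compound_A` and none with `compound_B`, so `compound_A` is ranked first and clears the 0.8 threshold. A real ensemble could weight or learn the combination differently; the mean is used here only because it is the simplest rule to show.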
About the journal:
The Journal of the American Society for Mass Spectrometry presents research papers covering all aspects of mass spectrometry, incorporating coverage of fields of scientific inquiry in which mass spectrometry can play a role.
Comprehensive in scope, the journal publishes papers on both fundamentals and applications of mass spectrometry. Fundamental subjects include instrumentation principles, design, and demonstration, structures and chemical properties of gas-phase ions, studies of thermodynamic properties, ion spectroscopy, chemical kinetics, mechanisms of ionization, theories of ion fragmentation, cluster ions, and potential energy surfaces. In addition to full papers, the journal offers Communications, Application Notes, and Accounts and Perspectives.