In response to "On E-value for tandem MS scoring schemes"

Jainab Khatun, Morgan C. Giddings
Journal of bioinformatics. Published 2008-07-15. DOI: https://doi.org/10.1093/bioinformatics/btn252

Abstract

We thank Mark Segal for raising the issue of interpreting MS/MS scores. As he noted, we used a method proposed by Fenyo and Beavis (FB) (2003) to assess the significance of identifications made using HMM_Score. In his letter, Segal makes two basic assertions about this use: (1) that the extreme value distribution (EVD) does not apply to the MS/MS database scoring systems used by FB and our HMM, and (2) that the linear tail fitting of the log survival function is not robust. He proposes a method that he authored as an alternative for estimating EVD parameters, which he says may be more robust, and also points to a method by Shen et al. that is specific to assessing the significance of protein/peptide identifications using MS/MS data. While it is valuable to examine whether there are better ways of statistically interpreting the results of an MS/MS search, Segal's letter did not provide any clear supporting evidence for his claim that MS/MS scorers cannot use E-values. In our case, we calculate a score distribution for all random matches on the fly, then derive the survival function, s (the complement of the cumulative probability distribution), and finally fit a line to the log of this function for the high-scoring portion of s. We verified the methodology on a series of randomly chosen HMM_Score search results, observing that in all cases the fit had very high correlation values (R² > 0.9). All subsequent validation of HMM_Score was performed using the E-values produced, and, as reported, the system performs well.
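The tail-fitting procedure described in the abstract can be illustrated with a minimal sketch. It is not the HMM_Score implementation: the function names, the choice of the top 10% of scores as the "high-scoring portion", and the extrapolation E = N · 10^(a + b·score) are all assumptions made for illustration, and the random scores are simulated from an exponential distribution rather than taken from a real search.

```python
import math
import random


def survival_tail_fit(scores, tail_fraction=0.1):
    """Fit log10(survival) = intercept + slope * score over the top tail.

    Hypothetical helper: tail_fraction is an assumed cutoff, not a value
    taken from the original work. Scores are assumed to be distinct.
    """
    ordered = sorted(scores)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least 3 scores to fit a line")
    k = max(3, int(n * tail_fraction))
    xs = ordered[n - k:]
    # Empirical survival at the i-th sorted score: fraction of scores >= it.
    ys = [math.log10((n - i) / n) for i in range(n - k, n)]

    # Ordinary least-squares line through (xs, ys).
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def e_value(score, slope, intercept, n_candidates):
    """Expected number of random matches scoring at least `score`.

    Assumed form: extrapolate the fitted line and scale by the number
    of candidates scored.
    """
    return n_candidates * 10.0 ** (intercept + slope * score)


# Simulated "random match" scores (assumption: exponential tail).
rng = random.Random(0)
random_scores = [rng.expovariate(1.0) for _ in range(5000)]

slope, intercept, r_squared = survival_tail_fit(random_scores)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.3f}")
print("E-value at score 12:", e_value(12.0, slope, intercept, len(random_scores)))
```

The R² returned by the fit is the kind of quantity the authors report checking (R² > 0.9); a low value on real data would suggest that the log-linear tail assumption does not hold for that search.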