In response to "On E-value for tandem MS scoring schemes"
Jainab Khatun, Morgan C. Giddings
Journal of bioinformatics, p. 1654, published 2008-07-15. DOI: https://doi.org/10.1093/bioinformatics/btn252
We thank Mark Segal for raising the issue of interpreting MS/MS scores. As he noted, we used a method proposed by Fenyo and Beavis (FB) (2003) to assess the significance of identifications made using HMM_Score. In his letter, Segal makes two basic assertions about this use: (1) that the extreme value distribution (EVD) does not apply to the MS/MS database scoring systems used by FB and by our HMM, and (2) that linear fitting of the tail of the log survival function is not robust. He proposes a method that he authored as an alternative for estimating EVD parameters, which he says may be more robust, and also points to a method by Shen et al. that is specific to assessing the significance of protein/peptide identifications using MS/MS data. While it is valuable to examine whether there are better ways of statistically interpreting the results of MS/MS searches, Segal did not provide any clear supporting evidence in his letter for his claim that MS/MS scorers cannot use E-values. In our case, we calculate a score distribution for all random matches on the fly, then derive the survival function, s (one minus the cumulative probability distribution), and finally fit a line to the log of this function over the high-scoring portion of s. We verified the methodology on a series of randomly chosen HMM_Score search results, observing that in all cases the fit had a very high correlation (R² > 0.9). All subsequent validation of HMM_Score was performed using the E-values produced, and, as reported, the system performs well.
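The procedure described above (score distribution, survival function, linear fit to the log of its tail) can be illustrated with a minimal sketch. This is not the HMM_Score implementation: the function names, the `tail_fraction` cutoff, the use of base-10 logarithms, the simulated exponential scores, and the conversion of the extrapolated survival value into an E-value by multiplying by the number of candidates are all assumptions made for illustration.

```python
import numpy as np


def fit_log_survival_tail(scores, tail_fraction=0.1):
    """Fit a line to log10 of the empirical survival function over the
    high-scoring tail. `tail_fraction` is a hypothetical cutoff."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    # Empirical survival: fraction of random matches scoring at least
    # as high as each sorted score (assumes no tied scores).
    survival = (n - np.arange(n)) / n
    # Keep only the highest-scoring portion of the distribution.
    start = int(np.floor(n * (1.0 - tail_fraction)))
    x = scores[start:]
    y = np.log10(survival[start:])
    # Least-squares line through (score, log10 survival).
    slope, intercept = np.polyfit(x, y, 1)
    r_squared = np.corrcoef(x, y)[0, 1] ** 2
    return slope, intercept, r_squared


def e_value(score, slope, intercept, n_candidates):
    """Extrapolate the fitted line to `score` and scale by the number of
    candidates compared (an assumed convention for this sketch)."""
    return n_candidates * 10.0 ** (slope * score + intercept)


# Simulated random-match scores stand in for real search results.
rng = np.random.default_rng(0)
random_scores = rng.exponential(scale=1.0, size=5000)
slope, intercept, r_squared = fit_log_survival_tail(random_scores)
print(slope, intercept, r_squared)
print(e_value(12.0, slope, intercept, n_candidates=random_scores.size))
```

Checking `r_squared` for each spectrum, as the authors did, gives a quick indication of whether the linear tail assumption holds for that particular score distribution.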