How to Train a Postprocessor for Tandem Mass Spectrometry Proteomics Database Search While Maintaining Control of the False Discovery Rate
Jack Freestone, Lukas Käll, William Stafford Noble, Uri Keich
Journal of Proteome Research, pp. 2266-2279. Published 2025-05-02 (Epub 2025-03-31). DOI: https://doi.org/10.1021/acs.jproteome.4c00742
Abstract
Decoy-based methods are a popular choice for the statistical validation of peptide detection in tandem mass spectrometry and proteomics data. Such methods can achieve a substantial boost in statistical power when coupled with postprocessors such as Percolator that use auxiliary features to learn a better-discriminating scoring function. However, we recently showed that Percolator can struggle to control the false discovery rate (FDR) when reporting the list of discovered peptides. To address this problem, we introduce Percolator-RESET, which is an adaptation of our recently developed RESET meta-procedure to the peptide detection problem. Specifically, Percolator-RESET fuses Percolator's iterative SVM training procedure with RESET's general framework to provide valid false discovery rate control. Percolator-RESET operates in both a standard single-decoy mode and a two-decoy mode, with the latter requiring the generation of two decoys per target. We demonstrate that Percolator-RESET controls the FDR in both modes, both theoretically and empirically, while typically reporting only a marginally smaller number of discoveries than Percolator in the single-decoy mode. The two-decoy mode is marginally more powerful than both Percolator and the single-decoy mode and exhibits less variability than the latter.
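For readers unfamiliar with decoy-based validation, the following is a minimal Python sketch of standard target-decoy competition, where the FDR among the top-ranked targets is estimated as (decoys + 1) / targets. It is not Percolator-RESET and not the authors' code: the function name `tdc_discoveries` and its inputs are hypothetical, it assumes each target has already competed against one decoy so that only the winners are passed in, and it gives tied scores no special treatment.

```python
from typing import List


def tdc_discoveries(scores: List[float], is_target: List[bool], alpha: float) -> int:
    """Count the top-scoring targets reportable at estimated FDR <= alpha.

    Each entry describes the winner of one target/decoy competition:
    its score, and whether the winner was the target (True) or the decoy (False).
    Hypothetical helper for illustration only.
    """
    # Rank the competition winners from best score to worst.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    targets = 0
    decoys = 0
    best = 0
    for i in order:
        if is_target[i]:
            targets += 1
        else:
            decoys += 1
        # Estimated FDR among the targets ranked so far, with the +1 correction.
        if targets > 0 and (decoys + 1) / targets <= alpha:
            best = targets  # keep the largest cutoff that satisfies the threshold
    return best


if __name__ == "__main__":
    # Toy, made-up scores: ten competition winners, two of which are decoys.
    toy_scores = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
    toy_labels = [True, True, True, True, True, False, True, True, False, True]
    print(tdc_discoveries(toy_scores, toy_labels, alpha=0.25))
```

A postprocessor such as Percolator replaces the raw search score used for ranking with a learned score. The abstract's point is that this learning step can compromise the decoy-based estimate unless it is done carefully, which is the problem Percolator-RESET is designed to address.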