FineFDR: Fine-grained Taxonomy-specific False Discovery Rates Control in Metaproteomics.

Proceedings. IEEE International Conference on Bioinformatics and Biomedicine. Pub Date: 2022-12-01. DOI: 10.1109/bibm55620.2022.9995401

Shengze Wang, Shichao Feng, Chongle Pan, Xuan Guo

Volume 2022, pages 287-292. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9998077/pdf/nihms-1868490.pdf


Abstract

Microbial community proteomics, also termed metaproteomics, investigates all proteins expressed by a microbiota. Tandem mass spectrometry (MS/MS) is the typical method for identifying proteins in metaproteomics; it involves searching the mass spectra against a protein sequence database. A major post-analysis step is controlling the false discovery rate (FDR), i.e., the ratio of false positives to the total number of annotations. The currently popular target-decoy FDR estimation method treats all peptides and proteins equally and overlooks that they may have different probabilities of being identified. In this study, we report FineFDR, a framework for fine-grained FDR assessment that takes taxonomy information into account. FineFDR groups the identified peptide-spectrum matches, peptides, and proteins by taxonomic unit and estimates the FDR in each group separately. Experiments on simulated and real-world data sets demonstrate that FineFDR achieved higher precision and more peptide and protein identifications than state-of-the-art methods such as Comet, Percolator, TIDD, and Tailor. FineFDR is freely available under the GNU GPL license at https://github.com/Biocomputing-Research-Group/FDR.
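To make the contrast between pooled and group-wise target-decoy estimation concrete, the following is a minimal sketch in Python. It is not FineFDR's actual implementation: the function names, the toy scores, the taxon labels "A" and "B", and the simple decoys/targets estimator are all assumptions chosen for illustration.

```python
from collections import defaultdict


def accepted_at_fdr(psms, threshold):
    """Count target PSMs accepted at the given FDR threshold.

    psms: iterable of (score, is_decoy) pairs; a higher score is better.
    The FDR at each rank is estimated as decoys / targets (an assumed,
    simplified target-decoy estimator), and the largest score-ranked
    prefix whose estimate stays within the threshold is accepted.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = accepted = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= threshold:
            accepted = targets
    return accepted


def accepted_per_group(psms, threshold):
    """Apply the same estimator separately within each taxonomic group.

    psms: iterable of (score, is_decoy, taxon) triples.
    """
    groups = defaultdict(list)
    for score, is_decoy, taxon in psms:
        groups[taxon].append((score, is_decoy))
    return {taxon: accepted_at_fdr(members, threshold)
            for taxon, members in groups.items()}


# Hypothetical toy data: taxon "A" has clean high-scoring targets,
# taxon "B" has targets interleaved with decoys.
toy_psms = [
    (10, False, "A"), (9, False, "A"), (8, False, "A"), (7, False, "A"),
    (1, True, "A"),
    (6, True, "B"), (5, False, "B"), (4, False, "B"), (3, True, "B"),
]

pooled = accepted_at_fdr([(s, d) for s, d, _ in toy_psms], threshold=0.25)
grouped = accepted_per_group(toy_psms, threshold=0.25)
```

In this toy data the pooled estimate lets the two targets from taxon "B" through because the strong hits from taxon "A" dilute the decoy count, whereas the group-wise estimate rejects them. This only illustrates why per-group estimates can differ from a pooled one; it says nothing about the magnitude of the effect on real data.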

