FineFDR: Fine-grained Taxonomy-specific False Discovery Rates Control in Metaproteomics.

Shengze Wang, Shichao Feng, Chongle Pan, Xuan Guo
Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, vol. 2022, pp. 287-292. Published 2022-12-01.
DOI: https://doi.org/10.1109/bibm55620.2022.9995401
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9998077/pdf/nihms-1868490.pdf

Abstract

Microbial community proteomics, also termed metaproteomics, investigates all proteins expressed by a microbiota. Tandem mass spectrometry (MS/MS) is the typical method for identifying proteins in metaproteomics; identification involves searching the mass spectra against a protein sequence database. A major post-analysis step is controlling the false discovery rate (FDR), i.e., the ratio of false positives to the total number of annotations. The popular target-decoy FDR estimation method treats all peptides and proteins equally and overlooks that they can have different probabilities of being identified. In this study, we report FineFDR, a framework for FDR assessment at fine-grained levels that takes taxonomy information into account. FineFDR groups the identified peptide-spectrum matches (PSMs), peptides, and proteins by taxonomic unit and estimates the FDR in each group separately. Experiments on simulated and real-world data sets demonstrate that FineFDR achieved higher precision and more peptide and protein identifications than state-of-the-art methods such as Comet, Percolator, TIDD, and Tailor. FineFDR is freely available under the GNU GPL license at https://github.com/Biocomputing-Research-Group/FDR.
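
The contrast between a single global target-decoy estimate and a per-group estimate can be illustrated with a minimal sketch. This is not the FineFDR implementation; it only assumes the common target-decoy estimator (number of decoys divided by number of targets at or above a score cutoff) and applies it once to all peptide-spectrum matches and once within each taxonomic group. The `PSM` record, its field names, and both function names are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple


class PSM(NamedTuple):
    """Hypothetical peptide-spectrum match record (not FineFDR's data model)."""
    score: float    # assumption: a higher score means a better match
    is_decoy: bool  # True if the match is to a decoy sequence
    taxon: str      # hypothetical label of the taxonomic unit


def accepted_at_fdr(psms, threshold):
    """Return target PSMs kept at the most permissive cutoff with estimated FDR <= threshold.

    The FDR at a cutoff is estimated as (#decoys / #targets) among the PSMs
    ranked at or above that cutoff. Score ties are not treated specially.
    """
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    best_cut = 0  # number of top-ranked PSMs to keep
    for i, p in enumerate(ranked, start=1):
        if p.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= threshold:
            best_cut = i
    return [p for p in ranked[:best_cut] if not p.is_decoy]


def accepted_per_taxon(psms, threshold):
    """Apply the same estimator separately inside each taxonomic group."""
    groups = defaultdict(list)
    for p in psms:
        groups[p.taxon].append(p)
    return {taxon: accepted_at_fdr(group, threshold) for taxon, group in groups.items()}
```

Calling `accepted_at_fdr` on the pooled list gives the global result, while `accepted_per_taxon` lets each group have its own score cutoff, so a group with few decoys is not penalized by decoys that fall in another group. How FineFDR actually forms groups and handles peptides and proteins is described in the paper and repository, not in this sketch.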
