Discover true association rates in multi-protein complex proteomics data sets.

Proceedings. IEEE Computational Systems Bioinformatics Conference. Pub Date: 2005-01-01. DOI: 10.1109/csb.2005.29

Changyu Shen, Lang Li, Jake Yue Chen

Pages: 167-74

Citations: 2

Abstract

Experimental processes to collect and process proteomics data are increasingly complex, while the computational methods to assess the quality and significance of these data remain unsophisticated. These challenges have led to many biological oversights and computational misconceptions. We developed a complete empirical Bayes model to analyze multi-protein complex (MPC) proteomics data derived from peptide mass spectrometry detections of purified protein complex pull-down experiments. Our model considers not only bait-prey associations, but also prey-prey associations missed in previous work. Using our model and a yeast MPC proteomics data set, we estimated that there should be an average of 28 true associations per MPC, almost ten times as high as was previously estimated. For data sets generated to mimic a real proteome, our model achieved on average 80% sensitivity in detecting true associations, as compared with the 3% sensitivity in previous work, while maintaining a comparable false discovery rate of 0.3%.
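To make the two kinds of association and the two reported metrics concrete, here is a minimal Python sketch. It is not the authors' empirical Bayes model: it only enumerates the candidate bait-prey and prey-prey pairs for one pull-down and computes sensitivity and false discovery rate for a set of called associations against a known truth set. The protein names, the truth set, the called set, and both function names are hypothetical and exist purely for illustration.

```python
from itertools import combinations


def candidate_pairs(bait, preys):
    """Enumerate unordered candidate pairs for a single pull-down.

    Returns (bait_prey, prey_prey), each a set of frozensets.
    """
    bait_prey = {frozenset((bait, p)) for p in preys}
    prey_prey = {frozenset(pair) for pair in combinations(preys, 2)}
    return bait_prey, prey_prey


def sensitivity_and_fdr(called, truth):
    """Sensitivity = TP / |truth|; false discovery rate = FP / |called|."""
    tp = len(called & truth)   # called associations that are true
    fp = len(called - truth)   # called associations that are not true
    sensitivity = tp / len(truth) if truth else 0.0
    fdr = fp / len(called) if called else 0.0
    return sensitivity, fdr


# Hypothetical toy pull-down: one bait "B" and three preys (invented names).
bait, preys = "B", ["P1", "P2", "P3"]
bait_prey, prey_prey = candidate_pairs(bait, preys)

# Hypothetical ground truth and hypothetical calls, for illustration only.
truth = {frozenset(("B", "P1")), frozenset(("P1", "P2")), frozenset(("P2", "P3"))}
called = {frozenset(("B", "P1")), frozenset(("P1", "P2")), frozenset(("B", "P3"))}

sens, fdr = sensitivity_and_fdr(called, truth)
print(f"bait-prey candidates: {len(bait_prey)}, prey-prey candidates: {len(prey_prey)}")
print(f"sensitivity = {sens:.3f}, FDR = {fdr:.3f}")
```

With n preys there are n bait-prey candidates but n(n-1)/2 prey-prey candidates, so an analysis restricted to bait-prey pairs cannot recover true associations that occur only between preys.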
