Proteome-wide copy-number estimation from transcriptomics

IF 7.7 · Zone 1 (Biology) · JCR Q1, Biochemistry & Molecular Biology

Molecular Systems Biology · Pub Date: 2024-11-01 · Epub Date: 2024-09-27 · DOI: 10.1038/s44320-024-00064-3

Andrew J Sweatt, Cameron D Griffiths, Sarah M Groves, B Bishal Paudel, Lixin Wang, David F Kashatus, Kevin A Janes

Molecular Systems Biology, pages 1230-1256. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535397/pdf/

Citations: 0

Abstract



Protein copy numbers constrain systems-level properties of regulatory networks, but proportional proteomic data remain scarce compared to RNA-seq. We related mRNA to protein statistically using best-available data from quantitative proteomics and transcriptomics for 4366 genes in 369 cell lines. The approach starts with a protein's median copy number and hierarchically appends mRNA-protein and mRNA-mRNA dependencies to define an optimal gene-specific model linking mRNAs to protein. For dozens of cell lines and primary samples, these protein inferences from mRNA outmatch stringent null models, a count-based protein-abundance repository, empirical mRNA-to-protein ratios, and a proteogenomic DREAM challenge winner. The optimal mRNA-to-protein relationships capture biological processes along with hundreds of known protein-protein complexes, suggesting mechanistic relationships. We use the method to identify a viral-receptor abundance threshold for coxsackievirus B3 susceptibility from 1489 systems-biology infection models parameterized by protein inference. When applied to 796 RNA-seq profiles of breast cancer, inferred copy-number estimates collectively re-classify 26-29% of luminal tumors. By adopting a gene-centered perspective of mRNA-protein covariation across different biological contexts, we achieve accuracies comparable to the technical reproducibility of contemporary proteomics.
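The hierarchical idea in the abstract (start from a protein's median copy number, then append an mRNA-protein dependency, then an mRNA-mRNA dependency) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the log10 transformation, the use of ordinary least squares, the use of BIC to pick the "optimal" level, the function names, and the synthetic data.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares; returns coefficients and residual sum of squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)


def bic(rss, n, k):
    """BIC of a Gaussian model with k fitted parameters (assumed selection rule)."""
    rss = max(rss, 1e-12)  # guard against log(0) on perfectly fit data
    return n * np.log(rss / n) + k * np.log(n)


def select_gene_model(log_protein, log_mrna_self, log_mrna_other):
    """Pick the best of three nested models for one gene across cell lines.

    Level 0: median protein copy number only.
    Level 1: adds a dependence on the gene's own mRNA.
    Level 2: adds a dependence on a second gene's mRNA.
    """
    n = log_protein.size
    ones = np.ones(n)
    candidates = {}

    # Level 0: the median is the only parameter.
    median = float(np.median(log_protein))
    rss0 = float(np.sum((log_protein - median) ** 2))
    candidates["median"] = (np.array([median]), bic(rss0, n, 1))

    # Level 1: intercept + own mRNA.
    X1 = np.column_stack([ones, log_mrna_self])
    coef1, rss1 = fit_ols(X1, log_protein)
    candidates["self_mrna"] = (coef1, bic(rss1, n, 2))

    # Level 2: intercept + own mRNA + another gene's mRNA.
    X2 = np.column_stack([ones, log_mrna_self, log_mrna_other])
    coef2, rss2 = fit_ols(X2, log_protein)
    candidates["self_plus_other_mrna"] = (coef2, bic(rss2, n, 3))

    best = min(candidates, key=lambda name: candidates[name][1])
    return best, candidates[best][0]


# Synthetic example: protein depends on its own mRNA and on a partner mRNA.
rng = np.random.default_rng(0)
n_lines = 369  # matches the number of cell lines in the abstract; data are simulated
mrna_self = rng.normal(size=n_lines)
mrna_other = rng.normal(size=n_lines)
protein = 5.0 + 0.8 * mrna_self + 0.5 * mrna_other + rng.normal(scale=0.1, size=n_lines)

model_name, coefficients = select_gene_model(protein, mrna_self, mrna_other)
print(model_name, np.round(coefficients, 2))
```

Keeping the median as the base level means a gene whose protein abundance does not track any mRNA still gets a usable estimate, which is one plausible reading of why the approach "starts with a protein's median copy number".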


Source journal

Molecular Systems Biology (Biology: Biochemistry & Molecular Biology)

CiteScore: 18.50

Self-citation rate: 1.00%

Review time: 6-12 weeks

About the journal: Systems biology aims to understand complex biological systems by studying their components and how those components interact. It is an integrative discipline that seeks to explain the properties and behavior of such systems. Molecular Systems Biology is a peer-reviewed, open-access journal that publishes research in systems biology, synthetic biology, and systems medicine; its content is freely available to readers.