CrypticProteinDB: an integrated database of proteome and immunopeptidome derived non-canonical cancer proteins.

NAR Cancer Pub Date : 2023-06-01 DOI:10.1093/narcan/zcad024

Ghofran Othoum, Christopher A Maher

NAR Cancer, volume 5, issue 2, article zcad024. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10233886/pdf/


Abstract

Translated non-canonical proteins derived from noncoding regions or alternative open reading frames (ORFs) can contribute to critical and diverse cellular processes. In the context of cancer, they also represent an under-appreciated source of targets for cancer immunotherapy through their tumor-enriched expression or by harboring somatic mutations that produce neoantigens. Here, we introduce the largest integration and proteogenomic analysis of novel peptides to assess the prevalence of non-canonical ORFs (ncORFs) in more than 900 patient proteomes and 26 immunopeptidome datasets across 14 cancer types. The integrative proteogenomic analysis of whole-cell proteomes and immunopeptidomes revealed peptide support for a nonredundant set of 9760 upstream, downstream, and out-of-frame ncORFs in protein-coding genes and 12811 in noncoding RNAs. Notably, 6486 ncORFs were derived from differentially expressed genes and 340 were ubiquitously translated across eight or more cancers. The analysis also led to the discovery of thirty-four epitopes and eight neoantigens from non-canonical proteins in two cohorts as novel cancer immunotargets. Collectively, our analysis integrated both bottom-up proteogenomic and targeted peptide validation to illustrate the prevalence of translated non-canonical proteins in cancer and to provide a resource for the prioritization of novel proteins supported by proteomic, immunopeptidomic, genomic and transcriptomic data, available at https://www.maherlab.com/crypticproteindb.


