High-quality peptide evidence for annotating non-canonical open reading frames as human proteins

Eric W Deutsch, Leron W Kok, Jonathan M Mudge, Jorge Ruiz-Orera, Ivo Fierro-Monti, Zhi Sun, Jennifer G Abelin, M Mar Alba, Julie L Aspden, Ariel A Bazzini, Elspeth Bruford, Marie A Brunet, Lorenzo Calviello, Steven A Carr, Anne-Ruxandra Carvunis, Sonia Chothani, Jim Clauwaert, Kellie Dean, Pouya Faridi, Adam Frankish, Norbert Hubner, Nicholas Ingolia, Michele Magrane, Maria Jesus Martin, Thomas F Martinez, Gerben Menschaert, Uwe Ohler, Sandra Orchard, Owen Rackham, Xavier Roucou, Sarah A Slavoff, Eivind Valen, Aaron C Wacholder, Jonathan S. Weissman, Wei Wu, Zhi Xie, Jyoti Choudhary, Michal Bassani-Sternberg, Juan Antonio Vizcaino, Nicola Ternette, Robert L. Moritz, John Prensner, Sebastiaan van Heesch
Journal: bioRxiv - Molecular Biology
Publication date: 2024-09-09
DOI: 10.1101/2024.09.09.612016 (https://doi.org/10.1101/2024.09.09.612016)
Citations: 0

Abstract

A major scientific drive is to characterize the protein-coding genome as it provides the primary basis for the study of human health. But the fundamental question remains: what has been missed in prior genomic analyses? Over the past decade, the translation of non-canonical open reading frames (ncORFs) has been observed across human cell types and disease states, with major implications for proteomics, genomics, and clinical science. However, the impact of ncORFs has been limited by the absence of a large-scale understanding of their contribution to the human proteome. Here, we report the collaborative efforts of stakeholders in proteomics, immunopeptidomics, Ribo-seq ORF discovery, and gene annotation, to produce a consensus landscape of protein-level evidence for ncORFs. We show that at least 25% of a set of 7,264 ncORFs give rise to translated gene products, yielding over 3,000 peptides in a pan-proteome analysis encompassing 3.8 billion mass spectra from 95,520 experiments. With these data, we developed an annotation framework for ncORFs and created public tools for researchers through GENCODE and PeptideAtlas. This work will provide a platform to advance ncORF-derived proteins in biomedical discovery and, beyond humans, diverse animals and plants where ncORFs are similarly observed.