moPepGen: Rapid and Comprehensive Identification of Non-canonical Peptides.

bioRxiv: the preprint server for biology. Pub Date: 2024-11-05. DOI: 10.1101/2024.03.28.587261

Chenghao Zhu, Lydia Y Liu, Annie Ha, Takafumi N Yamaguchi, Helen Zhu, Rupert Hugh-White, Julie Livingstone, Yash Patel, Thomas Kislinger, Paul C Boutros



Abstract

Gene expression is a multi-step transformation of biological information from its storage form (DNA) into functional forms (protein and some RNAs). Regulatory activities at each step of this transformation multiply a single gene into a myriad of proteoforms. Proteogenomics is the study of how genomic and transcriptomic variation creates this proteomic diversity, and is limited by the challenges of modeling the complexities of gene expression. We therefore created moPepGen, a graph-based algorithm that comprehensively generates non-canonical peptides in linear time. moPepGen works with multiple technologies, in multiple species and on all types of genetic and transcriptomic data. In human cancer proteomes, it enumerates previously unobservable non-canonical peptides arising from germline and somatic genomic variants, noncoding open reading frames, RNA fusions and RNA circularization. By enabling efficient detection and quantitation of previously hidden proteins in both existing and new proteomic data, moPepGen facilitates all proteogenomics applications. It is available at: https://github.com/uclahs-cds/package-moPepGen.
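The abstract gives no implementation details, so the following is only a toy illustration of what a "non-canonical peptide" means in practice: a peptide that appears in the in-silico digest of a variant protein but not in the digest of the reference protein. It uses a brute-force set comparison of tryptic digests, not moPepGen's graph-based algorithm. The function names, the simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages), and the sequences are all hypothetical and do not come from the package.

```python
import re


def tryptic_digest(protein: str, min_len: int = 5) -> set[str]:
    """Simplified trypsin rule: cleave after K or R unless the next residue is P.

    No missed cleavages are modeled. Peptides shorter than min_len are dropped.
    """
    # Zero-width split point: preceded by K/R and not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in pieces if len(p) >= min_len}


def noncanonical_peptides(
    canonical: str, variants: list[str], min_len: int = 5
) -> set[str]:
    """Return peptides found in any variant digest but absent from the canonical digest."""
    reference = tryptic_digest(canonical, min_len)
    found: set[str] = set()
    for variant in variants:
        found |= tryptic_digest(variant, min_len) - reference
    return found


if __name__ == "__main__":
    # Hypothetical toy sequences, not real proteins.
    canonical = "MAGTKLLPSERVDAK"
    variant = "MAGTKLLPRERVDAK"  # S -> R substitution creates a new cleavage site
    print(sorted(noncanonical_peptides(canonical, [variant], min_len=4)))
```

Enumerating every variant protein explicitly, as this sketch does, grows combinatorially when several variants can co-occur on one transcript. That cost is the motivation the abstract gives for a graph-based approach.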

