Discovering microproteins: making the most of ribosome profiling data.

IF 3.6, Zone 3 (Biology), JCR Q2 (Biochemistry & Molecular Biology)
RNA Biology Pub Date : 2023-01-01 Epub Date: 2023-11-27 DOI:10.1080/15476286.2023.2279845
Sonia Chothani, Lena Ho, Sebastian Schafer, Owen Rackham
Citations: 0

Abstract

Building a reference set of protein-coding open reading frames (ORFs) has revolutionized biological process discovery and understanding. Traditionally, gene models have been confirmed using cDNA sequencing and encoded translated regions inferred using sequence-based detection of start and stop combinations longer than 100 amino-acids to prevent false positives. This has led to small ORFs (smORFs) and their encoded proteins left un-annotated. Ribo-seq allows deciphering translated regions from untranslated irrespective of the length. In this review, we describe the power of Ribo-seq data in detection of smORFs while discussing the major challenge posed by data-quality, -depth and -sparseness in identifying the start and end of smORF translation. In particular, we outline smORF cataloguing efforts in humans and the large differences that have arisen due to variation in data, methods and assumptions. Although current versions of smORF reference sets can already be used as a powerful tool for hypothesis generation, we recommend that future editions should consider these data limitations and adopt unified processing for the community to establish a canonical catalogue of translated smORFs.
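
To illustrate the sequence-based detection the abstract refers to, here is a minimal sketch of a start/stop ORF scan with a length cutoff. It is not the authors' method: the function name `find_orfs`, the parameter `min_aa`, the forward-strand-only scan, and the ATG-only start rule are all assumptions made for illustration. It shows why a 100-amino-acid cutoff leaves smORFs unannotated.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_aa=100):
    """Scan the forward strand for ATG...stop ORFs (hypothetical helper).

    Returns (start, end, n_aa) tuples. `end` is exclusive and includes the
    stop codon; `n_aa` counts codons from ATG up to, not including, the stop.
    Only ORFs strictly longer than `min_aa` amino acids are reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if n_aa > min_aa:  # assumed cutoff: "longer than" min_aa
                    orfs.append((start, i + 3, n_aa))
                start = None  # look for the next ORF in this frame
    return orfs


# Toy transcript encoding an 11-amino-acid peptide (made-up sequence).
toy = "ATG" + "GCT" * 10 + "TAA"
default_hits = find_orfs(toy)           # 100-aa cutoff discards the smORF
relaxed_hits = find_orfs(toy, min_aa=10)
```

With the default cutoff the toy ORF is discarded, while lowering `min_aa` recovers it. Sequence alone cannot tell whether such a short ORF is actually translated, which is the gap the abstract says Ribo-seq addresses.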

Source journal: RNA Biology (Biology: Biochemistry and Molecular Biology)
CiteScore: 8.60
Self-citation rate: 0.00%
Articles published: 82
Review time: 1 month
Journal description: RNA has played a central role in all cellular processes since the beginning of life: decoding the genome, regulating gene expression, mediating molecular interactions, catalyzing chemical reactions. RNA Biology, as a leading journal in the field, provides a platform for presenting and discussing cutting-edge RNA research. RNA Biology brings together a multidisciplinary community of scientists working in the areas of: transcription and splicing; post-transcriptional regulation of gene expression; non-coding RNAs; RNA localization; translation and catalysis by RNA; structural biology; bioinformatics; RNA in disease and therapy.