Statistical methodology for ribosomal frameshift detection

Alisa Yurovsky, Justin Gardin, B. Futcher, S. Skiena
Published in: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
Publication date: 2022-08-07
DOI: 10.1145/3535508.3545529 (https://doi.org/10.1145/3535508.3545529)

Abstract

During normal protein synthesis, the ribosome shifts along the messenger RNA (mRNA) by exactly three nucleotides for each amino acid added to the protein being translated. However, in special cases, the sequence of the mRNA somehow induces the ribosome to slip, which shifts the "reading frame" in which the mRNA is translated and gives rise to an otherwise unexpected protein. Such "programmed frameshifts" are well known in viruses, including coronavirus, and a few cases of programmed frameshifting are also known in cellular genes. However, there is no good way, either experimental or informatic, to identify novel cases of programmed frameshifting. Thus, it is possible that substantial numbers of cellular proteins generated by programmed frameshifting in human and other organisms remain unknown. Here, we build on prior work observing that data from ribosome profiling can be analyzed for anomalies in mRNA reading frame periodicity to identify putative programmed frameshifts. We develop a statistical framework to identify all likely (even for very low frameshifting rates) frameshift positions in a genome. We also develop a frameshift simulator for ribosome profiling data to verify our algorithm. We show high sensitivity of prediction on the simulated data, retrieving 97.4% of the simulated frameshifts. Furthermore, our method found all three of the known yeast genes with programmed frameshifts. Our results suggest there could be a large number of unannotated alternative proteins in the yeast genome, generated by programmed frameshifting. This motivates further study and parallel investigations in the human genome.
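The abstract does not spell out the statistical framework or the simulator, so the following is only a toy sketch of the general idea of looking for a change in reading-frame periodicity. It is not the authors' method. Every name here (`simulate_reads`, `frame_counts`, `chi2_2x3`, `scan_for_shift`) and every parameter value is a hypothetical choice made for illustration, and the test used is a plain Pearson chi-square comparison of frame counts upstream and downstream of each candidate position.

```python
import math
import random
from collections import Counter


def simulate_reads(cds_len=600, shift_at=300, n_reads=2000,
                   in_frame=0.8, shift_rate=0.5, seed=0):
    """Toy simulator (assumed model): P-site positions with a +1 shift after shift_at."""
    rng = random.Random(seed)
    positions = []
    for _ in range(n_reads):
        codon = rng.randrange(cds_len // 3)
        # Most reads fall in the dominant frame; the rest go to the other two frames.
        offset = 0 if rng.random() < in_frame else rng.choice((1, 2))
        pos = codon * 3 + offset
        # Downstream of the site, a fraction of ribosomes is in the +1 frame.
        if pos >= shift_at and rng.random() < shift_rate:
            pos += 1
        if pos < cds_len:
            positions.append(pos)
    return positions


def frame_counts(psite_positions, lo, hi):
    """Count reads in frames 0, 1, 2 for positions in the half-open range [lo, hi)."""
    counts = Counter(p % 3 for p in psite_positions if lo <= p < hi)
    return [counts.get(f, 0) for f in range(3)]


def chi2_2x3(a, b):
    """Pearson chi-square statistic for a 2x3 table whose rows are a and b."""
    total_a, total_b = sum(a), sum(b)
    n = total_a + total_b
    stat = 0.0
    for f in range(3):
        col = a[f] + b[f]
        if col == 0:
            continue  # an empty frame column contributes nothing
        for obs, row_total in ((a[f], total_a), (b[f], total_b)):
            expected = row_total * col / n
            if expected > 0:
                stat += (obs - expected) ** 2 / expected
    return stat


def scan_for_shift(psite_positions, cds_len, step=3, margin=30):
    """Return (split, statistic, p) for the split with the largest chi-square."""
    best_stat, best_split = 0.0, None
    for split in range(margin, cds_len - margin + 1, step):
        up = frame_counts(psite_positions, 0, split)
        down = frame_counts(psite_positions, split, cds_len)
        if sum(up) == 0 or sum(down) == 0:
            continue
        stat = chi2_2x3(up, down)
        if stat > best_stat:
            best_stat, best_split = stat, split
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2).
    # This p-value is NOT corrected for scanning many candidate splits.
    return best_split, best_stat, math.exp(-best_stat / 2)


if __name__ == "__main__":
    reads = simulate_reads()
    split, stat, p = scan_for_shift(reads, cds_len=600)
    print(f"best split: {split}, chi-square: {stat:.1f}, uncorrected p: {p:.3g}")
```

A real analysis would also need to assign P-site offsets per read length, handle low-coverage genes, and correct for the many positions tested, none of which this sketch attempts.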