Modeling Global and Local Codon Bias with Deep Language Models

2017 IEEE 17th International Conference on Bioinformatics and Bioengineering (BIBE) Pub Date : 2017-10-01 DOI:10.1109/BIBE.2017.00-63

M. Fujimoto, P. Bodily, Cole A. Lyman, Andrew J. Jacobsen, Q. Snell, M. Clement

Citations: 3

Abstract

Codon bias, the pattern of synonymous codon usage when a protein sequence is encoded as nucleotides, is a biological phenomenon that is not fully understood. Several methods exist to represent the codon bias of an organism: the codon adaptation index (CAI) [1], individual codon usage (ICU), hidden stop codons (HSC) [2], and codon context (CC) [3]. These methods are often employed when optimizing heterologous gene expression to increase the accuracy and rate of translation. However, they have many shortcomings because they do not take into account the local and global context of a gene. We present a method for modeling global and local codon bias with deep language models that is more robust than current methods because it captures more contextual information and long-range dependencies.
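As an illustration of the kind of context-free measure the abstract contrasts against, the following is a minimal sketch of the codon adaptation index in its commonly cited form: each codon receives a relative adaptiveness weight (its count in a reference gene set divided by the count of the most frequent synonymous codon), and a gene's CAI is the geometric mean of those weights. This is not the deep language model the paper proposes. The function names, the toy sequences, and the pseudocount of 0.5 for unobserved codons are assumptions made for this example.

```python
import math
from collections import Counter

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def split_codons(seq):
    """Split a coding sequence into codons, dropping any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_genes, pseudocount=0.5):
    """Weight each codon by its count relative to its most used synonym.

    Met, Trp, and stop codons are left out because they carry no
    synonymous choice. The pseudocount (an assumption of this sketch)
    stands in for codons never seen in the reference set.
    """
    counts = Counter(c for gene in reference_genes for c in split_codons(gene))
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    weights = {}
    for aa, family in families.items():
        if aa in "MW*":
            continue
        adjusted = {c: counts[c] if counts[c] > 0 else pseudocount for c in family}
        best = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / best
    return weights


def cai(gene, weights):
    """Geometric mean of the weights of the gene's informative codons."""
    logs = [math.log(weights[c]) for c in split_codons(gene) if c in weights]
    if not logs:
        raise ValueError("gene contains no codons with a synonymous choice")
    return math.exp(sum(logs) / len(logs))


# Toy usage: the reference set prefers GCT over GCC for alanine.
reference = ["GCTGCTGCTGCC"]
weights = relative_adaptiveness(reference)
print(cai("ATGGCTGCC", weights))
```

Each codon is scored independently of its neighbors and of its position in the gene, which is the lack of local and global context that the abstract identifies as a shortcoming.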
