Circular code identified by the codon usage

IF 2.0 | Zone 4 (Biology) | JCR Q2 (BIOLOGY)
Christian J. Michel
Biosystems, volume 244, Article 105308. Published 2024-08-17.
DOI: 10.1016/j.biosystems.2024.105308
https://www.sciencedirect.com/science/article/pii/S030326472400193X
Citations: 0

Abstract

Circular code identified by the codon usage

Since 1996, circular codes in genes have been identified thanks to the development of 6 statistical approaches: trinucleotide frequencies per frame (Arquès and Michel, 1996), correlation functions per frame (Arquès and Michel, 1997), frame permuted trinucleotide frequencies (Frey and Michel, 2003, 2006), advanced statistical functions at the gene population level (Michel, 2015) and at the gene level (Michel, 2017). All these 3-frame statistical methods analyse the trinucleotide information in the 3 frames of genes: the reading frame and the 2 shifted frames. Notably, codon usage does not allow for the identification of circular codes (Michel, 2020). This has been a long-standing problem since 1996, hindering biologists’ access to circular code theory.

By considering circular code conditions resulting from code theory, particularly the concept of permutation class, and building upon previous statistical work, a new statistical approach based solely on the codon usage, i.e. a 1-frame statistical method, surprisingly reveals the maximal C³ self-complementary trinucleotide circular code X in bacterial genes and in average (bacterial, archaeal, eukaryotic) genes, and almost in archaeal genes. Additionally, a new parameter definition indicates that bacterial and archaeal genes exhibit codon usage dispersion of the same order of magnitude, but significantly higher than that observed in eukaryotic genes. This statistical finding may explain the greater variability of codes in eukaryotic genes compared to bacterial and archaeal genes, an issue that has been open for many years. Finally, biologists can now search for new (variant) circular codes at both the genome level (across all genes in a given genome) and the gene level using only codon usage, without the need for analysing the shifted frames.
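
To make the two ingredients named above concrete (codon usage in the reading frame only, and permutation classes), here is a minimal Python sketch. It is not the statistical method of the paper, whose details the abstract does not give. The selection rule (keep the most used codon in each permutation class, ties broken alphabetically) and all function names are assumptions chosen for illustration only.

```python
from collections import Counter
from itertools import product


def permutation_class(t: str) -> frozenset:
    """Return the circular shifts of trinucleotide t, e.g. AAC -> {AAC, ACA, CAA}."""
    return frozenset(t[i:] + t[:i] for i in range(3))


def codon_usage(gene: str) -> Counter:
    """Count trinucleotides in the reading frame only (a 1-frame count)."""
    gene = gene.upper()
    usable = len(gene) - len(gene) % 3  # ignore a trailing incomplete codon
    return Counter(gene[i:i + 3] for i in range(0, usable, 3))


def most_used_per_class(usage: Counter) -> set:
    """Keep one codon from each of the 20 permutation classes of size 3.

    Hypothetical rule for illustration: take the most used codon of the
    class, breaking ties alphabetically.
    """
    classes = {permutation_class("".join(p)) for p in product("ACGT", repeat=3)}
    selected = set()
    for cls in classes:
        if len(cls) < 3:
            # AAA, CCC, GGG, TTT form classes of size 1 and cannot
            # belong to a circular code, so they are skipped.
            continue
        selected.add(min(cls, key=lambda t: (-usage[t], t)))
    return selected


if __name__ == "__main__":
    usage = codon_usage("AACAACACAGGT")
    print(sorted(most_used_per_class(usage)))
```

Choosing exactly one trinucleotide per permutation class is a necessary condition for a maximal trinucleotide circular code, but it is not sufficient: a set built this way still has to be checked for circularity before it can be compared with X.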

Source journal
Biosystems (Biology)
CiteScore: 3.70
Self-citation rate: 18.80%
Articles published: 129
Review time: 34 days
About the journal: BioSystems encourages experimental, computational, and theoretical articles that link biology, evolutionary thinking, and the information processing sciences. The link areas form a circle that encompasses the fundamental nature of biological information processing, computational modeling of complex biological systems, evolutionary models of computation, the application of biological principles to the design of novel computing systems, and the use of biomolecular materials to synthesize artificial systems that capture essential principles of natural biological information processing.