Structural Features of the Nucleotide Sequences of Genomes

M. Takeda, M. Nakahara
Journal of Computer Aided Chemistry. Published: 2009-01-01. DOI: 10.2751/JCAC.10.38 (https://doi.org/10.2751/JCAC.10.38)
Citations: 8

Abstract

We propose structural features of genomic DNA that are essential for generating and analyzing genomes. We calculated the frequency of occurrence of the nucleotides (bases) throughout the entire genome, treated as a single polynucleotide molecule consisting of adenine (A), thymine (T), guanine (G), and cytosine (C) bases and including both coding and non-coding regions, primarily in the genomes of Saccharomyces cerevisiae, Escherichia coli, and Homo sapiens. Our results indicate that the base sequence of a single strand of DNA has the following characteristics: (1) reverse-complement symmetry of runs of 3-9 successive bases; (2) bias in the distribution of the four bases A, T, G, and C; and (3) multiple fractality of that distribution depending on distance. In a double-logarithmic plot (power spectrum) of L (the distance from a base to the next base) versus P(L) (the probability of the base distribution at L), P(L) decreases exponentially at short distances and linearly at long distances. These structural features of a single strand of DNA can be clearly observed in any genomic DNA and are especially pronounced in eukaryotic genomes. By contrast, in artificial genomes or chromosomes with the same number of bases, the same base content, and the same frequencies of the 64 triplets, the bias and the linearly decreasing fractality of the base distribution were absent, although the reverse-complement symmetry of the base sequences and the exponentially decreasing fractality of the base distribution were still observed.
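The two measurements named in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes that "reverse-complement symmetry" means a k-mer and its reverse complement occur about equally often on one strand, and that L is the gap between successive occurrences of the same base. The function names and the toy sequence are hypothetical.

```python
from collections import Counter

# Translation table for complementing a DNA string (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer on the given single strand."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def symmetry_pairs(seq: str, k: int) -> dict:
    """Map each observed k-mer to (its count, count of its reverse complement).

    Under reverse-complement symmetry the two numbers should be close.
    """
    counts = kmer_counts(seq, k)
    return {kmer: (n, counts.get(reverse_complement(kmer), 0))
            for kmer, n in counts.items()}


def gap_distribution(seq: str, base: str) -> dict:
    """Estimate P(L) for one base.

    Assumption: L is the distance from one occurrence of `base` to the next
    occurrence of the same base, and P(L) is the fraction of gaps of length L.
    """
    positions = [i for i, b in enumerate(seq) if b == base]
    gaps = Counter(b - a for a, b in zip(positions, positions[1:]))
    total = sum(gaps.values())
    if total == 0:
        return {}
    return {length: n / total for length, n in sorted(gaps.items())}


# Hypothetical toy sequence, not real genomic data.
toy = "ATGCGCATATGCAT"
print(symmetry_pairs(toy, 3))
print(gap_distribution(toy, "A"))
```

On a real chromosome, plotting log P(L) against log L from `gap_distribution` would be one way to look for the short-distance and long-distance regimes the abstract describes. The toy sequence is far too short to show either.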