A proposed new measure to verify the general version of Chargaff 2nd rule.

Camilo Fuentes Beals, Gonzalo Riadi Mahias, Karen Y. Oróstica, Ignacio Vidal
Proceedings of MOL2NET 2018, International Conference on Multidisciplinary Sciences, 4th edition. Published 2018-12-20. DOI: 10.3390/MOL2NET-04-06090 (https://doi.org/10.3390/MOL2NET-04-06090)

Abstract

In the 1940s, Erwin Chargaff was the first to observe the parity between adenines (A) and thymines (T), and between cytosines (C) and guanines (G), in the DNA molecule. In the 1960s, Chargaff found a second parity rule, this time within a single strand of DNA: the amounts of A and T, and the amounts of C and G, are similar. The first rule is explained by the complementary nature of the double-stranded helix of the DNA molecule. For the second rule, however, a biological explanation has remained a mystery. In the last 40 years, a generalization of the second rule was proposed that treats the second rule as a particular case. This generalization states that, for any given k-mer and its reverse complement (RC), the number of times each is found in a single strand of DNA is similar. Two measures have been proposed to test the generalized Chargaff's second rule (gC2r); both include an artifact related to genome length. This has led the authors of those measures to think there is a minimum genome length and a maximum k-mer size for compliance. We propose a new way to measure the compliance of any given genome with the gC2r. The measure is the proportion of the genome that complies with the gC2r. Compliance is measured per k-mer/k-mer RC pair, using the natural logarithm of the number of times the k-mer is found divided by the number of times its reverse complement is found in the genome, or ln(#k-mer/#k-merRC). This measure is independent of the size of the analyzed k-mer and of the size of the genome. The measure has been implemented in a software tool, ChargaffCracker, which can rapidly analyze sequences and deliver a statistical report. We generated random genomes based on the base proportions and lengths of biological prokaryote genome sequences and compared them with the biological sequences. We conclude by hypothesizing that: (1) compliance with the gC2r is a consequence, not a cause, of the second rule; and (2) although Chargaff's second rule might be a consequence of transpositions and inversions, the limits of compliance with the gC2r are a property of the sequence model of genomes, not of the biology of organisms. However, this property might have been selected to fulfill biological needs in genome evolution.
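
The per-pair quantity described in the abstract, ln(#k-mer/#k-merRC), can be illustrated with a minimal Python sketch. This is not the ChargaffCracker implementation, which the abstract does not detail. The function names, the tolerance `tol` used to call a pair compliant, the weighting of pairs by their occurrence counts, the skipping of palindromic k-mers, and the i.i.d. model in `random_genome` are all assumptions made here for illustration only.

```python
import math
import random
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_BASES = frozenset("ACGT")


def reverse_complement(kmer):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def kmer_counts(seq, k):
    """Count overlapping k-mers on one strand, skipping k-mers with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if _BASES.issuperset(kmer):
            counts[kmer] += 1
    return counts


def log_ratios(counts):
    """Map one representative of each k-mer/RC pair to ln(#k-mer / #k-merRC)."""
    ratios = {}
    for kmer in counts:
        rc = reverse_complement(kmer)
        first, second = min(kmer, rc), max(kmer, rc)
        if first == second or first in ratios:
            continue  # assumption: palindromic k-mers are skipped (ratio is trivially 0)
        a, b = counts.get(first, 0), counts.get(second, 0)
        if b == 0:
            ratios[first] = math.inf  # assumption: an unseen partner is non-compliant
        elif a == 0:
            ratios[first] = -math.inf
        else:
            ratios[first] = math.log(a / b)
    return ratios


def compliance_proportion(seq, k, tol=0.1):
    """Fraction of counted k-mer occurrences that sit in pairs with |ln ratio| <= tol.

    Both `tol` and the occurrence weighting are illustrative assumptions.
    """
    counts = kmer_counts(seq, k)
    total = compliant = 0
    for first, ratio in log_ratios(counts).items():
        weight = counts.get(first, 0) + counts.get(reverse_complement(first), 0)
        total += weight
        if abs(ratio) <= tol:
            compliant += weight
    return compliant / total if total else 0.0


def random_genome(length, base_freqs, seed=0):
    """Draw an i.i.d. sequence with the given base proportions (assumed random model)."""
    rng = random.Random(seed)
    bases = sorted(base_freqs)
    weights = [base_freqs[b] for b in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))
```

For example, `compliance_proportion("AATCG", 1)` evaluates to 0.4 under these assumptions: the A/T pair has ln(2/1) ≈ 0.69, which exceeds the tolerance, while the C/G pair has ln(1/1) = 0, so 2 of the 5 counted bases fall in a compliant pair. A comparison in the spirit of the abstract could call `random_genome` with the length and base proportions of a real sequence and evaluate both sequences with the same `k` and `tol`.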