Coding Structure for the ORF1ab, S, M and N Coronavirus Genes
M. Chaley, Zh.S. Tyulko, V. Kutyrkin
Mathematical Biology and Bioinformatics, published 2020-12-25
DOI: 10.17537/2020.15.441 (https://doi.org/10.17537/2020.15.441)
Citations: 1
Abstract
A spectral-statistical approach was applied to the comparative analysis of coronavirus genomes from four genera: Alphacoronavirus, Betacoronavirus (including the new SARS-CoV-2 virus), Gammacoronavirus and Deltacoronavirus. The analysis examined 3-regularity and the existence of latent triplet profile periodicity in the coding sequences (CDSs) of four genes: ORF1ab, encoding the transcriptase; the S gene, encoding the spike-forming glycoprotein; the M gene, encoding the membrane protein; and the N gene, encoding the nucleoprotein. A total of 3410 genomes were analyzed, and each of the four gene groups contained the same number of genes. Latent triplet profile periodicity was revealed in the CDSs of practically all analyzed ORF1ab, S and N genes, together with a high value of the 3-regularity index, which estimates how well the coding triplet structure is conserved. In contrast, the coding structure of the M genes showed a tendency to diffuse toward homogeneity in 60 % of the genes in the alphacoronavirus genomes analyzed and in 67 % of the genes of the gammacoronaviruses. The same tendency toward structural diffusion, accompanied by a lower average 3-regularity index than in the other genes while the triplet profile periodicity is preserved, was also noted for the M genes of SARS-CoV-2 viruses. This tendency probably reflects the significance of M-gene variability in the adaptation of coronaviruses to novel hosts of the genus. Analysis of the 3-profile periodicity matrices of the four groups of SARS-CoV-2 genes considered in this work, for viruses isolated in Europe, Asia and the USA, did not reveal significant differences among them, which suggests a single source of propagation of this virus.
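The abstract does not define the 3-profile periodicity matrix or the 3-regularity index, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that a triplet profile can be approximated by counting nucleotides at each of the three codon positions of a CDS, and that departure from homogeneity can be gauged with an ordinary Pearson chi-square statistic. The function names `triplet_profile` and `chi_square_heterogeneity` are hypothetical.

```python
ALPHABET = "ACGT"  # row order of the profile matrix (an assumption of this sketch)


def triplet_profile(cds: str) -> list[list[int]]:
    """Count nucleotides per codon position.

    Returns a 4x3 matrix: rows follow ALPHABET, columns are codon
    positions 1, 2 and 3. Characters outside ALPHABET are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete trailing codon
    counts = [[0] * 3 for _ in ALPHABET]
    for i in range(usable):
        base = cds[i]
        if base in ALPHABET:
            counts[ALPHABET.index(base)][i % 3] += 1
    return counts


def chi_square_heterogeneity(counts: list[list[int]]) -> float:
    """Pearson chi-square statistic for the 4x3 count matrix.

    The null hypothesis is homogeneity: all three codon positions share
    the same nucleotide composition. Larger values indicate a more
    pronounced triplet structure.
    """
    col_totals = [sum(row[j] for row in counts) for j in range(3)]
    total = sum(col_totals)
    if total == 0:
        return 0.0
    stat = 0.0
    for row in counts:
        row_total = sum(row)
        for j in range(3):
            expected = row_total * col_totals[j] / total
            if expected > 0:  # skip nucleotides absent from the sequence
                stat += (row[j] - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    periodic = "ATG" * 50      # strictly triplet-periodic toy sequence
    homogeneous = "ACGT" * 30  # same composition at every codon position
    print(chi_square_heterogeneity(triplet_profile(periodic)))
    print(chi_square_heterogeneity(triplet_profile(homogeneous)))
```

In this toy setting the strictly periodic sequence yields a large statistic and the homogeneous one yields zero, which mirrors the contrast the abstract draws between the ORF1ab, S and N genes and the more diffuse M genes. Real CDSs would fall between these extremes, and a significance threshold would have to be chosen separately.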