Md. Aminur Islam, Md Gulam Jilani, Mehboob Hoque, Safdar Ali
The Microbe, Volume 4, Article 100157. Published 2024-09-01. DOI: 10.1016/j.microb.2024.100157
https://www.sciencedirect.com/science/article/pii/S2950194624001249
Citations: 0
Abstract
Genome microsatellite signature analysis in Adenoviridae family of viruses
Simple sequence repeats (SSRs) occur in both coding and non-coding regions of the genomes of all organisms, with diversity in incidence, complexity, and repetition. The present study focuses on Adenoviridae, a family of non-enveloped viruses with a linear double-stranded DNA genome. Members of Adenoviridae are known to cause respiratory illnesses such as the common cold (cough, runny nose, mild fever) and, occasionally, pneumonia, as well as keratoconjunctivitis (an eye infection known as pink eye), croup, and cystitis (inflammation of the bladder). Our investigation aims to extract and analyse SSRs from 68 Adenoviridae genomes. Virus genome sequences were retrieved from NCBI, and SSRs were extracted using MISA. ETE3 and iTOL were used for phylogenetic tree construction, annotation, and visualization. Genome length ranged from 26163 bp to 45667 bp, while GC content varied from 33.6 % to 66.9 %. Genome-wide analysis revealed a total of 9861 SSRs and 793 compound SSRs (cSSRs). SSR incidence per genome ranged from 112 (A67) to 203 (A61). The most prevalent mono-, di-, and tri-SSR motifs were "A", "GC/CG", and "GAG/CTC", with 1177, 1859, and 201 occurrences respectively. About 78 % of SSRs were located in the coding regions of the studied genomes. In terms of protein-specific distribution, DNA polymerase had the highest incidence, with 485 SSRs. The presence of mono-SSRs in the A/T region is a marker for host determination and divergence. On average, 67.14 % of mono-SSRs were in the A/T region, with values ranging from 16.22 % to 99.11 %. A high prevalence of mono-SSRs in the A/T region was associated with human and related species as hosts. Further, viruses clustered by host in the phylogenetic tree, suggesting a role for the host in viral evolution. The presence of unique and conserved cSSRs as genome markers is also highlighted.
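The quantities reported in the abstract (GC content, SSR incidence, cSSR incidence, share of A/T mono-SSRs) can be illustrated with a small, self-contained sketch. This is not MISA and not the authors' pipeline. The minimum repeat thresholds in `MIN_REPEATS` and the maximum gap `dmax` used to join SSRs into a cSSR are hypothetical values chosen for illustration, since the abstract does not state the parameters used. Reading "mono-SSRs in the A/T region" as mono-SSRs whose motif is A or T is likewise an assumption. All function names are made up for this example.

```python
# Hypothetical minimum repeat counts per motif length (mono to hexa).
# The abstract does not give the thresholds used; these are illustrative.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def gc_content(seq):
    """Return the GC content of a DNA sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def _is_degenerate(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'GCGC')."""
    n = len(motif)
    return any(motif == motif[:k] * (n // k) for k in range(1, n) if n % k == 0)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return perfect SSRs as (start, end, motif, copies); 0-based, end exclusive."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        i = 0
        while i + size <= len(seq):
            motif = seq[i:i + size]
            # Skip ambiguous bases and motifs that belong to a shorter class.
            if set(motif) - set("ACGT") or _is_degenerate(motif):
                i += 1
                continue
            copies = 1
            while seq[i + copies * size:i + (copies + 1) * size] == motif:
                copies += 1
            if copies >= min_rep:
                hits.append((i, i + copies * size, motif, copies))
                i += copies * size
            else:
                i += 1
    return sorted(hits)


def find_cssrs(ssrs, dmax=10):
    """Group neighbouring SSRs separated by at most dmax bases (dmax is assumed)."""
    groups, current = [], []
    for hit in sorted(ssrs):
        if current and hit[0] - current[-1][1] <= dmax:
            current.append(hit)
        else:
            if len(current) > 1:
                groups.append(current)
            current = [hit]
    if len(current) > 1:
        groups.append(current)
    return groups


def at_mono_share(ssrs):
    """Percentage of mono-SSRs whose motif is A or T (one possible reading)."""
    mono = [h for h in ssrs if len(h[2]) == 1]
    if not mono:
        return 0.0
    return 100.0 * sum(h[2] in "AT" for h in mono) / len(mono)


if __name__ == "__main__":
    toy = "CT" + "A" * 7 + "CT" + "GC" * 4 + "T"  # toy sequence, not real data
    ssrs = find_ssrs(toy)
    print(gc_content(toy), ssrs, find_cssrs(ssrs), at_mono_share(ssrs))
```

On a real genome, the sequence would be read from a FASTA file retrieved from NCBI and the per-genome counts tabulated. Results would match MISA output only if the same thresholds and compound-distance setting were used.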