Analysis of the genetic diversity in RNA-directed RNA polymerase sequences: implications for an automated RNA virus classification system
Zhongshuai Tian, Tao Hu, Edward C Holmes, Jingkai Ji, Weifeng Shi
Virus Evolution, published 2024-07-25. DOI: 10.1093/ve/veae059 (https://doi.org/10.1093/ve/veae059)
Abstract
RNA viruses are characterized by a broad host range and high levels of genetic diversity. Despite a recent expansion in the known virosphere following metagenomic sequencing, our knowledge of the species-rank genetic diversity of RNA viruses, and how often they are misassigned and misclassified, is limited. We performed a clustering analysis of 7,801 RNA-directed RNA polymerase (RdRp) sequences representing 1,897 established RNA virus species. From this, we identified substantial genetic divergence within some virus species and inconsistency in RNA virus assignment between the GenBank database and the International Committee on Taxonomy of Viruses (ICTV). In particular, 27.57% of virus species comprised multiple virus operational taxonomic units (vOTUs); these included Alphainfluenzavirus influenzae, Mammarenavirus lassaense, Apple stem pitting virus, and Rotavirus A, each with over 100 vOTUs. In addition, the average amino acid identity between vOTUs within a single assigned species was relatively low: <90%, and sometimes <50%. However, when only exemplar sequences from virus species were analyzed, 1,889 of the ICTV-designated RNA virus species (99.58%) were clustered into a single vOTU. Clustering of RdRp sequences from different virus species also revealed that 17 vOTUs contained two distinct virus species. These potential misassignments were confirmed by phylogenetic analysis. A further analysis of average nucleotide identity (ANI) thresholds ranging from 70% to 97.5% revealed that at an ANI of 82.5%, 1,559 (82.18%) of the 1,897 virus species could be correctly clustered into a single vOTU. However, at ANI values greater than 82.5%, an increasing number of species were clustered into two or more vOTUs. In sum, we have identified some inconsistency and misassignment of RNA virus species based on the analysis of RdRp sequences alone, which has important implications for the development of an automated RNA virus classification system.
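The threshold sweep described in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: the abstract does not name the clustering tool or the identity calculation, so everything here is an assumption made for illustration. That includes the per-position `identity` function on pre-aligned sequences, the single-linkage rule in `cluster`, the helper `single_votu_fraction`, and the made-up sequences and species labels. It shows only the general idea that raising the identity threshold splits a species into more vOTUs.

```python
from collections import defaultdict
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster(seqs: dict, threshold: float) -> dict:
    """Single-linkage clustering (an assumption): link pairs with identity >= threshold.

    Returns a mapping from sequence id to the id of its cluster representative.
    """
    parent = {sid: sid for sid in seqs}

    def find(x):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)
    return {sid: find(sid) for sid in seqs}


def single_votu_fraction(species_of: dict, votu_of: dict) -> float:
    """Fraction of species whose sequences all fall into exactly one cluster."""
    votus = defaultdict(set)
    for sid, species in species_of.items():
        votus[species].add(votu_of[sid])
    return sum(len(v) == 1 for v in votus.values()) / len(votus)


# Hypothetical toy data: five short "sequences" assigned to two species.
toy_seqs = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAT",
    "s3": "ACGTTTTTAC",
    "s4": "GGGGGGGGGG",
    "s5": "GGGGGGGGGA",
}
toy_species = {"s1": "A", "s2": "A", "s3": "A", "s4": "B", "s5": "B"}

if __name__ == "__main__":
    for threshold in (0.65, 0.85, 0.95):
        votu_of = cluster(toy_seqs, threshold)
        frac = single_votu_fraction(toy_species, votu_of)
        print(f"threshold={threshold:.2f}  species in a single vOTU: {frac:.0%}")
```

On the toy data, species A stays in one cluster at the lowest threshold and splits as the threshold rises, mirroring the qualitative trend the abstract reports above an ANI of 82.5%. A real analysis would compute identities from proper alignments of RdRp sequences and would likely use a dedicated clustering tool.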
About the journal:
Virus Evolution is a new Open Access journal focusing on the long-term evolution of viruses, viruses as a model system for studying evolutionary processes, viral molecular epidemiology and environmental virology.
The aim of the journal is to provide a forum for original research papers, reviews, commentaries and a venue for in-depth discussion on the topics relevant to virus evolution.