Estimated size of the total genome and protein space of viruses.

IF 3.7 · CAS Zone 2 (Biology) · JCR Q2 (Microbiology)
mSphere Pub Date : 2025-02-25 DOI:10.1128/msphere.00683-24
Congyu Lu, Yifan Wu, Zheng Zhang, Longfei Mao, Xingyi Ge, Aiping Wu, Fengzhu Sun, Yongqiang Jiang, Yousong Peng
mSphere, article e0068324 · Publication type: Journal Article
Citations: 0

Abstract

Recent metagenomic studies have identified a vast number of viruses. However, the systematic assessment of the true genetic diversity of the whole virus community on our planet remains to be investigated. Here, we explored the genome and protein space of viruses by simulating the process of virus discovery in viral metagenomic studies. Among multiple functions, the power function was found to best fit the increasing trends of virus diversity and was, therefore, used to predict the genetic space of viruses. The estimate suggests that there are at least 8.23e+08 viral operational taxonomic units and 1.62e+09 viral protein clusters on Earth when assuming the saturation of the virus genetic space, taking into account the balance of costs and the identification of novel viruses. It is noteworthy that less than 3% of the viral genetic diversity has been uncovered thus far, emphasizing the vastness of the unexplored viral landscape. To saturate the genetic space, a total of 3.08e+08 samples would be required. Analysis of viral genetic diversity by ecosystem yielded estimates consistent with those mentioned above. Furthermore, the estimate of the virus genetic space remained robust when accounting for the redundancy of sampling, sampling time, sequencing platform, and parameters used for protein clustering. This study provides a guide for future sequencing efforts in virus discovery and contributes to a better understanding of viral diversity in nature.

IMPORTANCE

Viruses are the most abundant and diverse biological entities on Earth. In recent years, a large number of viruses have been discovered based on sequencing technology. However, it is not clear how many kinds of viruses exist on Earth. This study estimates that there are at least 823 million types of viruses and 1.62 billion types of viral proteins. Remarkably, less than 3% of this large diversity has been uncovered to date. These findings highlight the enormous potential for discovering new viruses and reveal a significant gap in our current understanding of the viral world. This study calls for increased attention and resources to be directed toward viral discovery and metagenomics and provides a guide for future sequencing efforts, enhancing our knowledge of viral diversity in nature for ecology, biology, and public health.
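
The abstract describes fitting a power function to the growth of virus diversity with sampling effort and extrapolating it to a saturation point. The sketch below illustrates that general idea only; it is not the authors' pipeline. The log-log least-squares fit, the function names, the synthetic data, and the saturation rule (stop when the expected number of new units per additional sample falls below a chosen threshold) are all assumptions made for illustration, since the abstract does not state the exact fitting procedure or cost criterion.

```python
import math


def fit_power_law(samples, counts):
    """Fit counts ~ a * samples**b by ordinary least squares in log-log space.

    Assumption: a log-log linear fit stands in for whatever fitting
    procedure the study actually used.
    """
    xs = [math.log(s) for s in samples]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # exponent of the power function
    a = math.exp(mean_y - b * mean_x)  # scale factor
    return a, b


def predict(a, b, n_samples):
    """Predicted cumulative number of distinct units after n_samples."""
    return a * n_samples ** b


def samples_to_saturation(a, b, min_new_per_sample):
    """Sample count at which the marginal gain a*b*n**(b-1) drops to the threshold.

    Hypothetical saturation rule: a power function with 0 < b < 1 never
    plateaus, so some cost-based cutoff is needed to define "saturation".
    """
    if not 0 < b < 1:
        raise ValueError("a diminishing-returns curve requires 0 < b < 1")
    return (min_new_per_sample / (a * b)) ** (1 / (b - 1))


if __name__ == "__main__":
    # Synthetic accumulation curve (not data from the study).
    samples = [10, 100, 1000, 10000]
    counts = [50 * s ** 0.8 for s in samples]

    a, b = fit_power_law(samples, counts)
    n_sat = samples_to_saturation(a, b, min_new_per_sample=1.0)
    print(f"a={a:.3f}, b={b:.3f}, samples to saturation={n_sat:.3e}")
    print(f"predicted diversity at saturation={predict(a, b, n_sat):.3e}")

    # Arithmetic on figures quoted in the abstract: "less than 3%" of
    # 8.23e+08 vOTUs implies fewer than this many have been uncovered.
    print(f"upper bound on uncovered vOTUs={0.03 * 8.23e8:.3e}")
```

The threshold in `samples_to_saturation` is what turns an unbounded power function into a finite estimate, which is consistent with the abstract's wording that the totals are lower bounds ("at least") tied to a balance between cost and novel discoveries.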

Source journal: mSphere
Subject category: Immunology and Microbiology - Microbiology
CiteScore: 8.50
Self-citation rate: 2.10%
Articles published: 192
Review time: 11 weeks
Journal description: mSphere™ is a multi-disciplinary open-access journal that will focus on rapid publication of fundamental contributions to our understanding of microbiology. Its scope will reflect the immense range of fields within the microbial sciences, creating new opportunities for researchers to share findings that are transforming our understanding of human health and disease, ecosystems, neuroscience, agriculture, energy production, climate change, evolution, biogeochemical cycling, and food and drug production. Submissions will be encouraged of all high-quality work that makes fundamental contributions to our understanding of microbiology. mSphere™ will provide streamlined decisions, while carrying on ASM's tradition for rigorous peer review.