Identification of diversity-generating retroelements in host-associated and environmental genomes: prevalence, diversity, and roles.
Authors: Mariela Carrasco-Villanueva, Chaoxian Wang, Chaochun Wei
Journal: BMC Genomics 25(1):1227 (Journal Article), published 2024-12-20
DOI: 10.1186/s12864-024-11124-1 (https://doi.org/10.1186/s12864-024-11124-1)
Impact factor: 3.5; JCR: Q2 (Biotechnology & Applied Microbiology)
Citations: 0
Abstract
Background: Diversity-generating retroelements (DGRs) are a family of genetic elements that produce mutations in target genes, which are often related to ligand-binding functions and possess a C-type lectin (CLec) domain that tolerates massive variation. They were first identified in viruses, then in bacteria and archaea from human-associated and environmental genomes. The DGR mechanism enables fast adaptation of organisms to ever-changing environments. However, their existence, phylogenetic and structural diversity, and functions across a wide range of environments remain largely unknown.
Results: Here we present a study of DGR systems based on metagenome-assembled genomes (MAGs) from host-associated, aquatic, terrestrial and engineered environments. In total, we identified 861 non-redundant DGR-RTs, of which ~5.7% are new. We found that microbes associated with human hosts harbor the highest number of DGRs and also exhibit a higher prevalence of DGRs. After normalizing by genome size and including more genome data, we found that DGRs occur more frequently in organisms with smaller genomes. Overall, we identified nine main clades in the phylogenetic tree of reverse transcriptases (RTs), some comprising specific phyla and cassette architectures. We identified 38 different cassette patterns, 6 of which appeared in at least 10 DGRs; the patterns differ in the numbers, arrangements, and orientations of their components. Finally, most of the target genes were related to ligand-binding and signaling functions, but we discovered a few cases in which the variable regions (VRs) were situated in domains other than the CLec.
Conclusions: Our research sheds light on the widespread prevalence of DGRs across environments and taxa, and supports the phylogenetic divergence of DGRs in different organisms. These variations might also occur in their structures, since some cassette architectures were common in specific underrepresented phyla. In addition, we suggest that VRs could be found in domains other than the CLec, which should be further explored for organisms in scarcely studied environments.
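The Results mention normalizing DGR occurrence by genome size, but the abstract does not say how this was done. The sketch below shows one plausible way to express it, as DGRs per megabase pair. The function name `dgrs_per_mbp`, the per-Mbp unit, and the example genomes are all assumptions for illustration, not the authors' method or data.

```python
def dgrs_per_mbp(dgr_count: int, genome_size_bp: int) -> float:
    """Return the number of DGRs per megabase pair of genome.

    Assumption: a simple count-per-length density; the study's actual
    normalization procedure is not described in the abstract.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return dgr_count / (genome_size_bp / 1_000_000)


# Hypothetical genomes (not data from the study).
genomes = {
    "small_genome": {"dgr_count": 2, "genome_size_bp": 2_000_000},
    "large_genome": {"dgr_count": 3, "genome_size_bp": 6_000_000},
}

# Density per genome: the larger genome carries more DGRs in absolute
# terms but fewer per Mbp.
densities = {name: dgrs_per_mbp(**g) for name, g in genomes.items()}
```

Under this kind of normalization, a genome with fewer DGRs in total can still have the higher DGR density, which is the sense in which DGRs may occur "more frequently" in smaller genomes.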
About the journal:
BMC Genomics is an open access, peer-reviewed journal that considers articles on all aspects of genome-scale analysis, functional genomics, and proteomics.
BMC Genomics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.