Eugene Yui-Ching Chow, Jieyu Zhao, Chun Kit Kwok, Ting-Fung Chan
Differential Evolution of CDS and UTR Non-canonical RNA G-quadruplex Structures in Eukaryotic Transcriptomes
Genomics, Proteomics & Bioinformatics. Journal Article. Published 2025-09-14.
DOI: https://doi.org/10.1093/gpbjnl/qzaf078
Citations: 0
Abstract
RNA G-quadruplexes (rG4s) are non-classical, four-stranded RNA secondary structures that play regulatory roles in various biological processes. Although canonical rG4s have been studied extensively, recent work has underscored the importance of non-canonical rG4s. In this study, we experimentally determined rG4 structures from multiple eukaryotic species. Bioinformatic analysis revealed that rG4s have been an integral feature of eukaryotic transcriptomes across 1 billion years of evolution, and that non-canonical rG4s consistently dominate the surveyed rG4omes. Over time, the rG4ome has expanded progressively, accompanied by a notable compositional shift in which untranslated region (UTR) rG4s became favored over protein coding sequence (CDS) rG4s. We also observed distinct evolutionary patterns for CDS and UTR rG4s, involving different evolutionary origins and canonicality drift patterns. Our findings suggest that new UTR rG4 sequences emerged rapidly during early mammalian evolution, whereas the more gradual increase in CDS rG4s is linked to changes in selective amino acid residue preferences. This theory plausibly accounts for both the prevalence of UTR rG4s and the emergence of canonical motifs in mammalian models. All rG4 structures identified in this study are available through the rG4-seq Database application at https://rg4s.science/.