Unraveling Unbreakable Hairpins: Characterizing RNA secondary structures that are persistent after dinucleotide shuffling.
Authors: Alyssa Pratt, David Anthony Hendrix
Journal: RNA (Impact Factor 4.2; JCR Q2, Biochemistry & Molecular Biology)
Publication date: 2025-05-06
Publication type: Journal Article
DOI: 10.1261/rna.080176.124 (https://doi.org/10.1261/rna.080176.124)
Abstract
The sequence of nucleotides that make up an RNA determines its structure, which in turn determines its function. The RNA hairpin, also known as a stem-loop, is a ubiquitous and fundamental feature of RNA secondary structure. A common method of randomizing an RNA sequence is dinucleotide shuffling with the Altschul-Erickson algorithm, which preserves the dinucleotide content of the sequence. This algorithm generates randomized sequences by sampling Eulerian paths through the de Bruijn graph representation of the original sequence. We identified a subset of RNA hairpins in the bpRNA-1m meta-database that always form hairpins after repeated application of dinucleotide shuffling. We investigated these "unbreakable hairpins" and found several common properties. First, we found that unbreakable hairpins had, on average, folding energies similar to those of other hairpins of similar length, although they frequently contained ultra-stable hairpin loops. We also found that their stems tend to be split by base type, with purines on one side and pyrimidines on the opposite side. Furthermore, we found that this specific sequence feature restricts the number of distinct Eulerian paths through their de Bruijn graph representation, resulting in a small number of distinguishable dinucleotide-shuffled sequences. Beyond this algorithmic means of identification, these distinct sequences may have biological significance because we found that a significant percentage occur in a specific location of 16S ribosomal RNAs. Finally, we present a formula to calculate the number of possible unique dinucleotide-shuffled sequences for an input RNA sequence, which has utility for the general application of the Altschul-Erickson algorithm.
About the journal:
RNA is a monthly journal which provides rapid publication of significant original research in all areas of RNA structure and function in eukaryotic, prokaryotic, and viral systems. It covers a broad range of subjects in RNA research, including: structural analysis by biochemical or biophysical means; mRNA structure, function and biogenesis; alternative processing: cis-acting elements and trans-acting factors; ribosome structure and function; translational control; RNA catalysis; tRNA structure, function, biogenesis and identity; RNA editing; rRNA structure, function and biogenesis; RNA transport and localization; regulatory RNAs; large and small RNP structure, function and biogenesis; viral RNA metabolism; RNA stability and turnover; in vitro evolution; and RNA chemistry.