An efficient algorithm for exploring RNA branching conformations under the nearest-neighbor thermodynamic model.

IF 1.7 | Region 4, Biology | JCR Q4, Biochemical Research Methods
Svetlana Poznanović, Owen Cardwell, Christine Heitsch
Journal: Algorithms for Molecular Biology
Publication date: 2026-03-28
DOI: 10.1186/s13015-025-00296-4 (https://doi.org/10.1186/s13015-025-00296-4)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13151262/pdf/
Citations: 0

Abstract

Background: In the Nearest-Neighbor Thermodynamic Model, a standard approach for RNA secondary structure prediction, the energy of the multiloops is modeled using a linear entropic penalty governed by three branching parameters. Although these parameters are typically fixed, recent work has shown that reparametrizing the multiloop score and considering alternative branching conformations can lead to significantly better structure predictions. However, prior approaches for exploring the alternative branching structures were computationally inefficient for long sequences.

Results: We present a novel algorithm that partitions the parameter space, identifying all distinct branching structures (optimal under different branching parameters) for a given RNA sequence using the fewest possible minimum free energy computations. Our method efficiently computes the full parameter-space partition and the associated optimal structures, enabling a comprehensive evaluation of the structural landscape across parameter choices. We apply this algorithm to the Archive II benchmarking dataset, assessing the maximum attainable prediction accuracy for each sequence under the reparameterized multiloop model. We find that the potential for improvement over default predictions is substantial in many cases, and that the optimal prediction accuracy is highly sensitive to auxiliary modeling decisions, such as the treatment of lonely base pairs and dangling ends.

Conclusion: Our results support the hypothesis that the conventional choice of multiloop parameters may limit prediction accuracy and that exploring alternative parameterizations is both tractable and worthwhile. The efficient partitioning algorithm we introduce makes this exploration feasible for longer sequences and larger datasets. Furthermore, we highlight several open challenges in identifying the optimal structure.
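
Illustrative sketch (not the authors' algorithm): the abstract describes a multiloop penalty that is linear in three branching parameters, so the optimal structure can change as those parameters vary. The toy Python example below assumes the commonly used form in which each multiloop contributes a + b·(unpaired nucleotides) + c·(branching helices), and summarizes every candidate structure by a hypothetical signature (x, y, z, w). The candidate list, the function names, and the brute-force grid scan are all assumptions made for illustration. The paper's method instead partitions the parameter space with as few minimum free energy computations as possible, and its details are not given in the abstract.

```python
from collections import defaultdict
from itertools import product

# Hypothetical signatures (x, y, z, w), invented for this sketch:
#   x = number of multiloops
#   y = unpaired nucleotides inside multiloops
#   z = helices branching off multiloops
#   w = remaining energy that does not depend on (a, b, c)
CANDIDATES = [
    (0, 0, 0, -10.0),
    (1, 4, 3, -16.0),
    (2, 6, 7, -21.0),
]


def energy(sig, a, b, c):
    """Total energy of a structure, linear in the branching parameters."""
    x, y, z, w = sig
    return a * x + b * y + c * z + w


def mfe_oracle(a, b, c):
    """Stand-in for a minimum free energy computation.

    A real implementation would run a folding algorithm on the sequence;
    here we simply take the minimum over the toy candidate list.
    """
    return min(CANDIDATES, key=lambda sig: energy(sig, a, b, c))


def partition_grid(values):
    """Group grid points (a, b, c) by the signature that is optimal there."""
    regions = defaultdict(list)
    for a, b, c in product(values, repeat=3):
        regions[mfe_oracle(a, b, c)].append((a, b, c))
    return dict(regions)


if __name__ == "__main__":
    for sig, points in partition_grid([0, 2, 10]).items():
        print(sig, "is optimal at", len(points), "grid points")
```

A grid scan like this calls the oracle once per grid point and can miss small regions entirely, which is the kind of inefficiency that motivates an exact partitioning of the parameter space.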

Source journal
Algorithms for Molecular Biology (Biology: Biochemical Research Methods)
CiteScore: 2.40
Self-citation rate: 10.00%
Articles published: 16
Review time: >12 weeks
Journal description: Algorithms for Molecular Biology publishes articles on novel algorithms for biological sequence and structure analysis, phylogeny reconstruction, and combinatorial algorithms and machine learning. Areas of interest include but are not limited to: algorithms for RNA and protein structure analysis, gene prediction and genome analysis, comparative sequence analysis and alignment, phylogeny, gene expression, machine learning, and combinatorial algorithms. Where appropriate, manuscripts should describe applications to real-world data. However, pure algorithm papers are also welcome if future applications to biological data are to be expected, or if they address complexity or approximation issues of novel computational problems in molecular biology. Articles about novel software tools will be considered for publication if they contain some algorithmically interesting aspects.