An efficient algorithm to compute the minimum free energy of interacting nucleic acid strands
Ahmed Shalaby, Damien Woods
arXiv - QuanBio - Biomolecules, published 2024-07-12
DOI: arxiv-2407.09676 (https://doi.org/arxiv-2407.09676)
Citations: 0
Abstract
The information-encoding molecules RNA and DNA form a combinatorially large
set of secondary structures through nucleic acid base pairing. Thermodynamic
prediction algorithms predict favoured, or minimum free energy (MFE), secondary
structures, and can assign an equilibrium probability to any structure via the
partition function: a Boltzmann-weighted sum over the set of secondary
structures. MFE is NP-hard in the presence of pseudoknots, base pairings that
violate a restricted planarity condition. However, unpseudoknotted structures
are amenable to dynamic programming: for a single DNA/RNA strand there are
polynomial time algorithms for MFE and partition function. For multiple
strands, the problem is more complicated due to entropic penalties. Dirks et al
[SIAM Review; 2007] showed that for O(1) strands, with N bases, there is a
partition function algorithm that runs in time polynomial in N; however, their technique did
not generalise to MFE, which they left open. We give the first polynomial-time
(O(N^4)) algorithm for unpseudoknotted multiple (O(1)) strand MFE, answering
the open problem from Dirks et al. The challenge lies in considering rotational
symmetry of secondary structures, a feature not immediately amenable to dynamic
programming algorithms. Our proof has two main technical contributions: First,
a polynomial upper bound on the number of symmetric secondary structures to be
considered when computing rotational symmetry penalties. Second, that bound is
leveraged by a backtracking algorithm to find the MFE in an exponential space
of contenders. Our MFE algorithm has the same asymptotic run time as Dirks et
al's partition function algorithm, suggesting efficient handling of rotational
symmetry, although with higher space complexity. It also seems reasonably tight in
the number of strands since Condon, Hajiaghayi & Thachuk [DNA27, 2021] have
shown that unpseudoknotted MFE is NP-hard for O(N) strands.
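
To make the single-strand case concrete, the following is a minimal sketch of how one dynamic programming recursion can yield both the MFE and the partition function for unpseudoknotted structures. It is not the algorithm of this paper or of Dirks et al: it handles one strand only, ignores rotational symmetry entirely, and uses an assumed toy energy model in which every base pair contributes a fixed energy of -1 (in units of kT) and a hairpin must enclose at least 3 unpaired bases. The names `fold`, `PAIR_ENERGY` and `MIN_LOOP` are illustrative.

```python
import math
from functools import lru_cache

# Assumed toy model (not a real nearest-neighbour energy model).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3        # assumed minimum number of unpaired bases inside a hairpin
PAIR_ENERGY = -1.0  # assumed energy per base pair, in units of kT


def fold(seq):
    """Return (mfe, partition_function) for one strand under the toy model."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Considers bases i..j inclusive. If the span is too short to hold
        # any pair, the only structure is the unpaired one: energy 0, weight 1.
        if j - i <= MIN_LOOP:
            return 0.0, 1.0

        # Case 1: base j is unpaired.
        best, z = solve(i, j - 1)

        # Case 2: base j pairs with some base k, which splits the span into
        # an independent left part (i..k-1) and an enclosed part (k+1..j-1).
        for k in range(i, j - MIN_LOOP):
            if (seq[k], seq[j]) in PAIRS:
                e_left, z_left = solve(i, k - 1)
                e_in, z_in = solve(k + 1, j - 1)
                best = min(best, e_left + e_in + PAIR_ENERGY)
                # Boltzmann weight of the pair times weights of both parts.
                z += z_left * z_in * math.exp(-PAIR_ENERGY)

        return best, z

    return solve(0, len(seq) - 1)


if __name__ == "__main__":
    mfe, z = fold("GGGAAACCC")
    # Equilibrium probability of one structure attaining the MFE.
    p_mfe = math.exp(-mfe) / z
    print(mfe, z, p_mfe)
```

The two cases are mutually exclusive, so each structure is counted exactly once in the sum and considered exactly once in the minimum. That shared decomposition is why the two quantities have the same asymptotic cost in this simple setting; the multi-stranded case described above is harder because the symmetry correction depends on the whole structure rather than on independent subproblems.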