Geometric Approaches to Gibbs Energy Landscapes and DNA Oligonucleotide Design

M. Garzon, Kiran C. Bobba
Journal: Int. J. Nanotechnol. Mol. Comput.
Publication date: 2011-07-01
Publication type: Journal Article
DOI: 10.4018/ijnmc.2011070104
URL: https://doi.org/10.4018/ijnmc.2011070104
Citation count: 6

Abstract

Geometric Approaches to Gibbs Energy Landscapes and DNA Oligonucleotide Design
DNA codeword design has been a fundamental problem since the early days of DNA computing. The problem calls for finding large sets of single DNA strands that do not cross-hybridize to themselves, to each other, or to each other's complements. Such strands represent so-called domains, particularly in the language of chemical reaction networks (CRNs). The problem has proven to be of interest in other areas as well, including DNA memories and phylogenetic analyses, because of the error correction and prevention properties of such codeword sets. In prior work, a theoretical framework to analyze this problem was developed, and natural and simple versions of codeword design were shown to be NP-complete under any single reasonable metric that approximates the Gibbs energy, which makes it very difficult in practice to find a general procedure that produces such maximal sets exactly and efficiently. In this framework, codeword design is partially reduced to finding large sets of strands maximally separated in DNA spaces, so the size of such sets depends on the geometry of these spaces. Here, the authors describe in detail a new general technique to embed DNA spaces in Euclidean spaces in such a way that oligonucleotides with high (respectively, low) hybridization affinity are mapped to neighboring (respectively, remote) points in a geometric lattice. This embedding makes concrete the long-held analogy between codeword design and error-correcting code design in information theory, phrased in terms of sphere packing, and leads to designs that are in some cases provably nearly optimal for small oligonucleotide sizes, whenever the corresponding spherical codes in Euclidean spaces are known to be so. It also leads to upper and lower bounds on the size of optimal codes for oligonucleotides shorter than 20-mers, as well as for a few infinite families of DNA strand lengths, based on estimates of the kissing (or contact) number for sphere codes in high-dimensional Euclidean spaces. Conversely, the authors show how solutions to DNA codeword design obtained by experimental or other means can also provide solutions to difficult sphere-packing problems via these approaches. Finally, the reduction suggests a tool that provides some insight into the approximate structure of the Gibbs energy landscapes, which play a primary role in the design and implementation of biomolecular programs.
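To make the "maximally separated strands" formulation concrete, here is a minimal Python sketch of greedy codeword selection over all strands of a fixed length. It is not the authors' embedding technique. The distance used (`proxy_distance`, a Hamming comparison against a strand and its reverse complement) is only a crude, assumed stand-in for a metric that approximates the Gibbs energy, and the names `greedy_code` and `min_dist` are hypothetical.

```python
from itertools import product

# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the strand that would pair perfectly with `strand`."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hamming(a, b):
    """Number of positions at which two equal-length strands differ."""
    return sum(x != y for x, y in zip(a, b))


def proxy_distance(x, y):
    """Toy hybridization distance (an assumption, not the paper's metric).

    A small value means x is close to y (so x could bind y's complement)
    or close to y's reverse complement (so x could bind y directly).
    """
    return min(hamming(x, y), hamming(x, reverse_complement(y)))


def greedy_code(length, min_dist):
    """Greedily collect strands that are pairwise at least `min_dist` apart
    and at least `min_dist` away from their own reverse complement."""
    code = []
    for letters in product("ACGT", repeat=length):
        strand = "".join(letters)
        # Skip strands likely to hybridize to copies of themselves.
        if hamming(strand, reverse_complement(strand)) < min_dist:
            continue
        # Keep the strand only if it is far from every codeword chosen so far.
        if all(proxy_distance(strand, c) >= min_dist for c in code):
            code.append(strand)
    return code


if __name__ == "__main__":
    code = greedy_code(length=4, min_dist=2)
    print(len(code), code[:5])
```

In the sphere-packing picture, each accepted codeword excludes a "ball" of nearby strands. A greedy pass like this one yields a valid code but in general not a maximum one, which is consistent with the hardness result cited in the abstract.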