Coding for Efficient DNA Synthesis

A. Lenz, Yi Liu, Cyrus Rashtchian, P. Siegel, A. Wachter-Zeh, Eitan Yaakobi
{"title":"Coding for Efficient DNA Synthesis","authors":"A. Lenz, Yi Liu, Cyrus Rashtchian, P. Siegel, A. Wachter-Zeh, Eitan Yaakobi","doi":"10.1109/ISIT44484.2020.9174272","DOIUrl":null,"url":null,"abstract":"For DNA data storage to become a feasible technology, all aspects of the encoding and decoding pipeline must be optimized. Writing the data into DNA, which is known as DNA synthesis, is currently the most costly part of existing storage systems. As a step toward more efficient synthesis, we study the design of codes that minimize the time and number of required materials needed to produce the DNA strands. We consider a popular synthesis process that builds many strands in parallel in a step-by-step fashion using a fixed supersequence S. The machine iterates through S one nucleotide at a time, and in each cycle, it adds the next nucleotide to a subset of the strands. The synthesis time is determined by the length of S. We show that by introducing redundancy to the synthesized strands, we can significantly decrease the number of synthesis cycles. We derive the maximum amount of information per synthesis cycle assuming S is an arbitrary periodic sequence. To prove our results, we exhibit new connections to cost-constrained codes.","PeriodicalId":159311,"journal":{"name":"2020 IEEE International Symposium on Information Theory (ISIT)","volume":"41 12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Symposium on Information Theory (ISIT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISIT44484.2020.9174272","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16

Abstract

For DNA data storage to become a feasible technology, all aspects of the encoding and decoding pipeline must be optimized. Writing the data into DNA, which is known as DNA synthesis, is currently the most costly part of existing storage systems. As a step toward more efficient synthesis, we study the design of codes that minimize the time and number of required materials needed to produce the DNA strands. We consider a popular synthesis process that builds many strands in parallel in a step-by-step fashion using a fixed supersequence S. The machine iterates through S one nucleotide at a time, and in each cycle, it adds the next nucleotide to a subset of the strands. The synthesis time is determined by the length of S. We show that by introducing redundancy to the synthesized strands, we can significantly decrease the number of synthesis cycles. We derive the maximum amount of information per synthesis cycle assuming S is an arbitrary periodic sequence. To prove our results, we exhibit new connections to cost-constrained codes.
高效DNA合成编码
为了使DNA数据存储成为一种可行的技术,必须对编码和解码管道的各个方面进行优化。将数据写入DNA,即DNA合成,是目前现有存储系统中最昂贵的部分。作为迈向更高效合成的一步,我们研究了编码的设计,以最大限度地减少产生DNA链所需的时间和所需材料的数量。我们考虑一种流行的合成过程,使用固定的超序列S,以一步一步的方式并行构建许多链。机器一次迭代一个核苷酸,在每个循环中,它将下一个核苷酸添加到链的子集中。合成时间由s的长度决定。我们表明,通过在合成链中引入冗余,我们可以显著减少合成周期的数量。假设S是任意周期序列,我们推导出每个合成周期的最大信息量。为了证明我们的结果,我们展示了与成本约束代码的新连接。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信