Coding for Efficient DNA Synthesis

2020 IEEE International Symposium on Information Theory (ISIT). Pub Date: 2020-06-01. DOI: 10.1109/ISIT44484.2020.9174272

A. Lenz, Yi Liu, Cyrus Rashtchian, P. Siegel, A. Wachter-Zeh, Eitan Yaakobi

Citations: 16

Abstract

For DNA data storage to become a feasible technology, all aspects of the encoding and decoding pipeline must be optimized. Writing the data into DNA, which is known as DNA synthesis, is currently the most costly part of existing storage systems. As a step toward more efficient synthesis, we study the design of codes that minimize the time and the amount of material needed to produce the DNA strands. We consider a popular synthesis process that builds many strands in parallel in a step-by-step fashion using a fixed supersequence S. The machine iterates through S one nucleotide at a time, and in each cycle, it adds the next nucleotide to a subset of the strands. The synthesis time is determined by the length of S. We show that by introducing redundancy to the synthesized strands, we can significantly decrease the number of synthesis cycles. We derive the maximum amount of information per synthesis cycle assuming S is an arbitrary periodic sequence. To prove our results, we exhibit new connections to cost-constrained codes.
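The synthesis model described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that S is the period ACGT repeated (the abstract allows an arbitrary periodic sequence), and the function names `synthesis_cycles` and `batch_cycles` are hypothetical, not taken from the paper. Each strand receives its next nucleotide in the first cycle where the machine offers it, so the number of cycles a strand needs is the length of the shortest prefix of S that contains it as a subsequence.

```python
def synthesis_cycles(strand: str, period: str = "ACGT") -> int:
    """Count cycles until `strand` is complete when the machine repeats `period`.

    Assumption: S is `period` repeated indefinitely; "ACGT" is only an example.
    """
    if not set(strand) <= set(period):
        raise ValueError("strand contains a nucleotide that never occurs in S")
    cycles = 0
    for base in strand:
        # Skip cycles in which the machine offers a nucleotide this strand
        # does not need next.
        while period[cycles % len(period)] != base:
            cycles += 1
        cycles += 1  # the cycle in which `base` is appended to the strand
    return cycles


def batch_cycles(strands, period: str = "ACGT") -> int:
    """Strands grow in parallel, so the slowest strand sets the synthesis time."""
    return max(synthesis_cycles(s, period) for s in strands)


if __name__ == "__main__":
    for s in ["ACGT", "AAAA", "TGCA"]:
        print(s, synthesis_cycles(s))
```

Under these assumptions, a strand such as ACGT that follows the period finishes in 4 cycles, whereas AAAA waits a full period between consecutive nucleotides. Restricting the codebook to strands that finish early costs redundancy but shortens S, which is the trade-off the abstract describes.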



