Geno-Weaving: A Framework for Low-Complexity Capacity-Achieving DNA Data Storage

Journal impact factor: 2.2
Hsin-Po Wang;Venkatesan Guruswami
Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 383-393
Published: 2025-09-16
DOI: 10.1109/JSAIT.2025.3610643
URL: https://ieeexplore.ieee.org/document/11165350/

Abstract

As a potential implementation of data storage using DNA molecules, multiple strands of DNA are stored unordered in a liquid container. When the data are needed, an array of DNA readers will sample the strands with replacement, producing a Poisson-distributed number of noisy reads for each strand. The primary challenge here is to design an algorithm that reconstructs data from these unsorted, repetitive, and noisy reads. In this paper, we lay down a capacity-achieving rateless code along each strand to encode its index; we then lay down a capacity-achieving block code at the same position across all strands to protect the data. These codes weave a low-complexity storage scheme that saturates the fundamental upper limit of DNA. This improves upon the previous work of Weinberger and Merhav, which proves said bound and uses high-complexity random codes to saturate the limit. Our scheme also differs from other concatenation-based implementations of DNA data storage in the sense that, instead of decoding the inner codes first and passing the results to the outer code, our decoder alternates between the rateless codes and the block codes.
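The read process described in the abstract can be sketched as a small simulation. This is only an illustration under stated assumptions, not the paper's construction: the function names, the four-letter alphabet, and the independent substitution noise with probability `sub_prob` are hypothetical choices (the abstract says only that reads are noisy). Drawing a fixed number of reads uniformly with replacement gives each strand a binomial read count, which is approximately Poisson when there are many strands.

```python
import random
from collections import Counter

ALPHABET = "ACGT"  # assumed alphabet for this sketch


def sample_reads(strands, num_reads, sub_prob, rng):
    """Draw num_reads strands uniformly with replacement and corrupt them.

    Each base is replaced by a different base with probability sub_prob
    (an assumed noise model). The returned strand index is ground truth
    for inspection only; a real decoder would not observe it.
    """
    reads = []
    for _ in range(num_reads):
        idx = rng.randrange(len(strands))
        noisy = []
        for base in strands[idx]:
            if rng.random() < sub_prob:
                noisy.append(rng.choice([b for b in ALPHABET if b != base]))
            else:
                noisy.append(base)
        reads.append((idx, "".join(noisy)))
    rng.shuffle(reads)  # reads arrive in no particular order
    return reads


def coverage_histogram(reads, num_strands):
    """Map each read count k to the number of strands read exactly k times."""
    per_strand = Counter(idx for idx, _ in reads)
    return Counter(per_strand.get(i, 0) for i in range(num_strands))


if __name__ == "__main__":
    rng = random.Random(0)
    strands = ["".join(rng.choice(ALPHABET) for _ in range(30)) for _ in range(1000)]
    reads = sample_reads(strands, num_reads=3000, sub_prob=0.05, rng=rng)
    hist = coverage_histogram(reads, len(strands))
    for k in sorted(hist):
        print(k, hist[k])
```

Running the script prints how many strands were read zero times, once, twice, and so on. Strands with zero reads are erased entirely, which is one reason the scheme needs a code across strands in addition to the index code along each strand.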