Neural Polar Decoders for DNA Data Storage

Journal impact factor (IF): 2.2
Ziv Aharoni;Henry D. Pfister
DOI: 10.1109/JSAIT.2025.3610751
Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 403-416
Published: 2025-09-16
Publication type: Journal Article
URL: https://ieeexplore.ieee.org/document/11165383/
Citations: 0

Abstract

Synchronization errors, arising from both synthesis and sequencing noise, present a fundamental challenge in DNA-based data storage systems. These errors are often modeled as insertion-deletion-substitution (IDS) channels, for which maximum-likelihood decoding is quite computationally expensive. In this work, we propose a data-driven approach based on neural polar decoders (NPDs) to design decoders with reduced complexity for channels with synchronization errors. The proposed architecture enables decoding over IDS channels with reduced complexity $O(A N \log N)$, where $A$ is a tunable parameter independent of the channel. NPDs require only sample access to the channel and can be trained without an explicit channel model. Additionally, NPDs provide mutual information (MI) estimates that can be used to optimize input distributions and code design. We demonstrate the effectiveness of NPDs on both synthetic deletion and IDS channels. For deletion channels, we show that NPDs achieve near-optimal decoding performance and accurate MI estimation, with significantly lower complexity than trellis-based decoders. We also provide numerical estimates of the channel capacity for the deletion channel. We extend our evaluation to realistic DNA storage settings, including channels with multiple noisy reads and real-world Nanopore sequencing data. Our results show that NPDs match or surpass the performance of existing methods while using significantly fewer parameters than the state-of-the-art. These findings highlight the promise of NPDs for robust and efficient decoding in DNA data storage systems.
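To make "sample access to the channel" concrete, the following is a minimal sketch of a synthetic IDS channel that produces (input, output) pairs of the kind a data-driven decoder could be trained on. The abstract does not specify the channel model, so everything here is an assumption for illustration: the per-symbol i.i.d. error model, the rule that an insertion emits a random base before the input base, and the hypothetical names `ids_channel` and `sample_pairs`. It is not the authors' implementation.

```python
import random

DNA = "ACGT"  # four-letter DNA alphabet


def ids_channel(x, p_ins, p_del, p_sub, rng):
    """Pass x through a simple i.i.d. IDS channel (assumed model, not from the paper).

    For each input base, exactly one event occurs:
      insertion    (prob. p_ins): emit a random base, then the input base
      deletion     (prob. p_del): emit nothing
      substitution (prob. p_sub): emit a different base
      otherwise: emit the input base unchanged
    """
    y = []
    for base in x:
        u = rng.random()
        if u < p_ins:
            y.append(rng.choice(DNA))
            y.append(base)
        elif u < p_ins + p_del:
            continue
        elif u < p_ins + p_del + p_sub:
            y.append(rng.choice([b for b in DNA if b != base]))
        else:
            y.append(base)
    return "".join(y)


def sample_pairs(n_samples, length, p_ins, p_del, p_sub, seed=0):
    """Draw (input, output) pairs; only samples are exposed, not the channel law."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_samples):
        x = "".join(rng.choice(DNA) for _ in range(length))
        pairs.append((x, ids_channel(x, p_ins, p_del, p_sub, rng)))
    return pairs


if __name__ == "__main__":
    for x, y in sample_pairs(3, 16, p_ins=0.05, p_del=0.05, p_sub=0.05):
        print(x, "->", y)
```

Because the output length differs from the input length whenever an insertion or deletion occurs, input and output symbols cannot be aligned position by position, which is what makes decoding over such channels harder than over substitution-only channels.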
Source journal metrics: CiteScore 8.20; self-citation rate 0.00%; articles published 0