Exact Error Exponents of Concatenated Codes for DNA Storage

arXiv - MATH - Information Theory. Pub Date: 2024-09-02. DOI: arxiv-2409.01223

Yan Hao Ling, Jonathan Scarlett


Citations: 0

Abstract



In this paper, we consider a concatenated-coding-based class of DNA storage codes in which the selected molecules are constrained to be taken from an "inner" codebook associated with the sequencing channel. This codebook is used in a "black-box" manner, and is only assumed to operate at an achievable rate in the sense of attaining asymptotically vanishing maximal (inner) error probability. We first derive the exact error exponent in a widely studied regime of constant rate and a linear number of sequencing reads, and show strict improvements over an existing achievable error exponent. Moreover, our achievability analysis is based on a coded-index strategy, implying that such strategies attain the highest error exponents within the broader class of codes that we consider. We then extend our results to other scaling regimes, including a super-linear number of reads, as well as certain low-rate regimes. We find that the latter come with notable intricacies, such as the suboptimality of codewords with all distinct molecules, and certain dependencies of the error exponents on the model for sequencing errors.
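To give a feel for the "linear number of sequencing reads" regime, the following is a minimal Python sketch and not the authors' model or analysis. It assumes, as is common in the DNA storage literature but not stated in the abstract, that each read picks one of the M stored molecules uniformly at random with replacement, and it ignores sequencing errors entirely. The function name `unseen_fraction` and the parameter `reads_per_molecule` (the constant c in N = cM reads) are hypothetical names introduced for illustration. Under these assumptions, the fraction of molecules that are never read should be close to e^{-c}.

```python
import math
import random


def unseen_fraction(num_molecules: int, reads_per_molecule: float, seed: int = 0) -> float:
    """Fraction of stored molecules never sampled in N = c * M uniform reads.

    Assumption (not from the abstract): reads are drawn uniformly at random
    with replacement, and every read is error-free.
    """
    rng = random.Random(seed)
    num_reads = int(reads_per_molecule * num_molecules)
    seen = set()
    for _ in range(num_reads):
        # Each read reveals the identity of one uniformly chosen molecule.
        seen.add(rng.randrange(num_molecules))
    return 1.0 - len(seen) / num_molecules


if __name__ == "__main__":
    M, c = 10_000, 2.0
    # Compare the simulated fraction with the approximation e^{-c}.
    print(unseen_fraction(M, c), math.exp(-c))
```

Because a constant fraction of molecules goes unread when the number of reads is linear in M, any code for this setting has to tolerate missing molecules as well as sequencing errors, which is one reason an outer code is layered over the inner codebook in concatenated schemes.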
