Constructions and decoding of GC-balanced codes for edit errors
Kenan Wu, Shu Liu
Finite Fields and Their Applications, published 2024-02-27
DOI: 10.1016/j.ffa.2024.102391
URL: https://www.sciencedirect.com/science/article/pii/S1071579724000303
Abstract
DNA-based storage is a promising data storage technique owing to its high density and long durability. During DNA synthesis and sequencing, edit errors (insertions, deletions, and substitutions) are inevitably introduced. An effective way to reduce the error probability is to keep the proportion of G and C bases in each DNA sequence at around 50%; such sequences are called GC-balanced. To deal with edit errors, DNA sequences are also expected to have error-correcting capabilities. In this paper, GC globally balanced and GC locally balanced error-correcting codes are explicitly constructed. Inspired by repetition codes, the proposed codes can correct multiple edit errors. Furthermore, an efficient decoding algorithm applicable to both codes is derived for the case in which only one kind of edit error occurs.
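To make the GC-balance constraint concrete, the following is a minimal Python sketch of how one might check it. It is not the paper's construction or decoder. The function names, the tolerance parameter `tol`, and the sliding-window reading of "locally balanced" are all assumptions made for illustration; the abstract does not give the paper's formal definitions. The `repeat_symbols` helper is the textbook symbol-repetition code, included only to show that repeating each base leaves the GC fraction unchanged.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def is_globally_balanced(seq: str, tol: float = 0.0) -> bool:
    """Assumed definition: GC fraction of the whole sequence is within tol of 0.5."""
    return abs(gc_content(seq) - 0.5) <= tol


def is_locally_balanced(seq: str, window: int, tol: float = 0.0) -> bool:
    """Assumed definition: every length-`window` substring is within tol of 0.5."""
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    return all(
        abs(gc_content(seq[i:i + window]) - 0.5) <= tol
        for i in range(len(seq) - window + 1)
    )


def repeat_symbols(seq: str, r: int) -> str:
    """Textbook repetition code: repeat each base r times (illustration only)."""
    return "".join(base * r for base in seq)


if __name__ == "__main__":
    print(is_globally_balanced("GGCCAATT"))            # balanced overall
    print(is_locally_balanced("GGCCAATT", window=4))   # but not in every window
    print(gc_content(repeat_symbols("ACGT", 3)))       # repetition keeps the GC fraction
```

Under these assumed definitions, a sequence such as `GGCCAATT` is globally balanced but not locally balanced. This difference is presumably why the two kinds of code are constructed separately.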
About the journal:
Finite Fields and Their Applications is a peer-reviewed technical journal publishing papers in finite field theory as well as in applications of finite fields. As a result of applications in a wide variety of areas, finite fields are increasingly important in several areas of mathematics, including linear and abstract algebra, number theory and algebraic geometry, as well as in computer science, statistics, information theory, and engineering.
For cohesion, and because so many applications rely on various theoretical properties of finite fields, it is essential that there be a core of high-quality papers on theoretical aspects. In addition, since much of the vitality of the area comes from computational problems, the journal publishes papers on computational aspects of finite fields as well as on algorithms and complexity of finite field-related methods.
The journal also publishes papers in various applications including, but not limited to, algebraic coding theory, cryptology, combinatorial design theory, pseudorandom number generation, and linear recurring sequences. There are other areas of application to be included, but the important point is that finite fields play a nontrivial role in the theory, application, or algorithm.