New Construction of Balanced Codes Based on Weights of Data for DNA Storage
Authors: Xiaozhou Lu; Sunghwan Kim
Journal: IEEE Transactions on Emerging Topics in Computing, vol. 11, no. 4, pp. 973-984
Publication date: 2023-07-13
DOI: 10.1109/TETC.2023.3293477
URL: https://ieeexplore.ieee.org/document/10183852/
Publication type: Journal Article
Impact factor: 5.1 (JCR Q1, Computer Science, Information Systems)
Open access: No
Source: Semantic Scholar
Citations: 0
Abstract
Maintaining a properly balanced GC content is crucial for minimizing errors in DNA storage, so constructing GC-balanced DNA codes has become an important research topic. In this article, we propose a novel code construction method based on the weight distribution of the data, which enables us to construct GC-balanced DNA codes. Additionally, we introduce a specific encoding process for both balanced and imbalanced data parts. One key difference between the proposed codes and existing codes is that the parity lengths of the proposed codes vary with the data parts, while the parity lengths of existing codes remain fixed. To evaluate the effectiveness of the proposed codes, we compare their average parity lengths to those of existing codes. Our results demonstrate that the proposed codes have significantly shorter average parity lengths for DNA sequences with appropriate GC contents.
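To make the terminology concrete, the following minimal sketch computes the GC weight and GC content of a DNA sequence and tests for balance. It is an illustration only and does not reproduce the construction proposed in the article. It assumes the common definition in which a sequence of even length n is GC-balanced when exactly n/2 of its symbols are G or C; the article may use a different or relaxed criterion. The function names `gc_weight`, `gc_content`, and `is_gc_balanced` are hypothetical.

```python
def gc_weight(seq: str) -> int:
    """Count the G and C nucleotides in a DNA sequence."""
    return sum(1 for base in seq.upper() if base in "GC")


def gc_content(seq: str) -> float:
    """Fraction of nucleotides that are G or C (0.0 for an empty sequence)."""
    return gc_weight(seq) / len(seq) if seq else 0.0


def is_gc_balanced(seq: str) -> bool:
    """Assumed definition: even length and exactly half the symbols are G or C."""
    return len(seq) % 2 == 0 and 2 * gc_weight(seq) == len(seq)


if __name__ == "__main__":
    for s in ("ACGT", "AATT", "GGCA"):
        print(s, gc_weight(s), gc_content(s), is_gc_balanced(s))
```

Under this assumed definition, a data part that already passes `is_gc_balanced` needs no correction, while an imbalanced part needs some, which is the intuition behind a parity length that varies with the data.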
About the journal:
IEEE Transactions on Emerging Topics in Computing publishes papers on emerging aspects of computer science, computing technology, and computing applications not currently covered by other IEEE Computer Society Transactions. Some examples of emerging topics in computing include: IT for Green, Synthetic and organic computing structures and systems, Advanced analytics, Social/occupational computing, Location-based/client computer systems, Morphic computer design, Electronic game systems, and Health-care IT.