Family of Mutually Uncorrelated Codes for DNA Storage Address Design

IF 4.4; Zone 4, Biology; JCR Q1, BIOCHEMICAL RESEARCH METHODS

IEEE Transactions on NanoBioscience Pub Date : 2025-01-16 DOI:10.1109/TNB.2025.3530470

Zhenlu Liu;Ben Cao;Qi Shao;Yanfen Zheng;Bin Wang;Shihua Zhou;Pan Zheng

Vol. 24, No. 3, pp. 295-304. Full text: https://ieeexplore.ieee.org/document/10843767/

Citations: 0

Abstract


Family of Mutually Uncorrelated Codes for DNA Storage Address Design

Deoxyribonucleic acid (DNA) has become an ideal medium for long-term storage and retrieval owing to its extremely high storage density and long-term stability. However, access efficiency remains a bottleneck in DNA storage, in particular because high-quality random-access address sequences are lacking. In this paper, we therefore report a series of approaches based on k-weakly mutually uncorrelated (k-WMU) codes for designing address sequences that improve the access efficiency of DNA storage. To address the poor scalability of DNA sequences at the base level, we propose a 0-m-ruling coding scheme combined with k-WMU codes that prevents address sequences from forming secondary structures with stem lengths from 3 to 9. Based on the decoupled structure, we further extend the k-WMU codes with an error-correction function while satisfying combinatorial biological constraints. To investigate the performance of the designed address sequences in real-world applications, we perform simulation experiments on thermodynamic properties and error-correction capability, and compare the minimum free energy (MFE), melting temperature (TM), and average decoding success rate (ADSR) with previous work. The results show that the designed address sequences have high MFE and ADSR values and a substantially reduced TM variance while satisfying the combinatorial biological constraints. Improving the quality of address sequences will help achieve accurate random access and enhance the robustness of the DNA storage system.
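The abstract does not define the k-WMU property or the stem constraint, so the following Python sketch is only an illustration under stated assumptions, not the authors' method. It assumes the definition commonly used in the DNA-storage literature: a set of equal-length codewords is k-WMU if no proper prefix of length at least k of any codeword equals a suffix of the same or another codeword. The stem check is a simple string-level proxy for secondary structure (an arm followed later by its reverse complement). The function names and the `min_loop` parameter are hypothetical, and the 0-m-ruling scheme itself is not reproduced here.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_k_wmu(codewords, k):
    """Check the (assumed) k-weakly mutually uncorrelated property.

    Returns True if, for every length l with k <= l < n, no prefix of
    length l of any codeword equals a suffix of length l of the same
    or any other codeword.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(codewords[0])
    if any(len(c) != n for c in codewords):
        raise ValueError("all codewords must have the same length")
    # product(..., repeat=2) also pairs each codeword with itself.
    for a, b in product(codewords, repeat=2):
        for l in range(k, n):
            if a[:l] == b[-l:]:
                return False
    return True


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_stem(seq, stem_len, min_loop=3):
    """String-level proxy for a hairpin stem of length stem_len.

    Returns True if some substring of length stem_len is followed, after
    at least min_loop bases (an assumed minimum loop size), by its
    reverse complement.
    """
    for i in range(len(seq) - stem_len + 1):
        partner = reverse_complement(seq[i:i + stem_len])
        if seq.find(partner, i + stem_len + min_loop) != -1:
            return True
    return False


def avoids_stems(seq, stem_lengths=range(3, 10)):
    """True if seq has no proxy stem for any length in stem_lengths (3 to 9)."""
    return not any(has_stem(seq, s) for s in stem_lengths)
```

A candidate address set could be screened by calling `is_k_wmu` on the whole set and `avoids_stems` on each sequence. A thermodynamic evaluation such as the MFE and TM comparison described in the abstract would need a dedicated folding tool and is outside this sketch.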


Source journal: IEEE Transactions on NanoBioscience (Engineering and Technology: Nanotechnology)

CiteScore: 7.00

Self-citation rate: 5.10%

Articles published: 197

Review time: >12 weeks

About the journal: The IEEE Transactions on NanoBioscience reports original, innovative, and interdisciplinary work on all aspects of molecular systems, cellular systems, and tissues (including molecular electronics). Topics span a broad spectrum of both foundations and applications. Specifically, the journal covers methods and techniques, experimental aspects, design and implementation, instrumentation and laboratory equipment, clinical aspects, hardware and software data acquisition and analysis, and computer-based modelling (based on traditional or high-performance computing: parallel computers or computer networks).