A Joint DNA Encoding Approach Based on LZW and Arithmetic Encoding

Impact factor: 2.3 | JCR: Q2 | Category: ENGINEERING, ELECTRICAL & ELECTRONIC
Zhongyang Cheng;Qiang Liu;Kun Yang
Journal: IEEE Transactions on Molecular, Biological, and Multi-Scale Communications, vol. 11, no. 2, pp. 237-245
Published: 2025-04-03
DOI: 10.1109/TMBMC.2025.3556858
URL: https://ieeexplore.ieee.org/document/10948464/
Open access: no
Source platform: Semanticscholar
Citations: 0

Abstract

A Joint DNA Encoding Approach Based on LZW and Arithmetic Encoding
Molecular communication (MC) is a novel approach to communication that employs nanoengineering and bioengineering technology to establish transient communication links in challenging environments. Deoxyribonucleic acid (DNA) molecular communication can transmit more data, and transmit it faster, than traditional molecular communication. DNA has also been shown to offer significant advantages over traditional information carriers, including excellent storage density and structural stability, which make it an ideal medium for information transmission. It is therefore important to investigate methods of increasing the information density of DNA in order to reduce costs and enhance overall performance. Lempel-Ziv-Welch (LZW) encoding builds a string table in which shorter codes represent longer strings. Arithmetic coding is a compression process that successively narrows an interval according to the probabilities of the symbols in the input stream. A notable drawback of LZW coding is its suboptimal compression efficiency and the data redundancy that remains after dictionary mapping. Arithmetic coding, by contrast, attains compression efficiency that approaches the Shannon limit. In this study, we propose a novel DNA encoding method that adaptively generates coding streams according to the characteristics of the stored content. The contributions of this paper are as follows: 1) A bespoke coding dictionary is constructed, which generates the corresponding coding stream according to the specific characteristics of the file to be stored. 2) Using arithmetic coding, these coding streams are compressed and converted into the final DNA sequence. Following comprehensive verification, the information density of this encoding method is found to be markedly superior to that of prevailing mainstream encoding schemes.
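The two building blocks named in the abstract can be illustrated with a minimal Python sketch. This is textbook LZW and textbook arithmetic-coding interval narrowing, not the authors' bespoke dictionary or their DNA mapping, neither of which is described in the abstract. The function names `lzw_encode` and `narrow_interval`, the 256-entry initial table, and the static frequency model are all assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction


def lzw_encode(text):
    """Textbook LZW: emit table codes while growing the string table."""
    # Assumed initial table: one entry per single-byte character (codes 0..255).
    table = {chr(i): i for i in range(256)}
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in table:
            # Keep extending the match while it is already in the table.
            current = candidate
        else:
            # Emit the longest known match, then register the new string.
            codes.append(table[current])
            table[candidate] = len(table)
            current = ch
    if current:
        codes.append(table[current])
    return codes


def narrow_interval(symbols):
    """Arithmetic-coding core: shrink [low, high) once per input symbol.

    Uses a static model built from the symbol counts of the input itself
    (an assumption; practical coders often use adaptive models).
    """
    counts = Counter(symbols)
    total = len(symbols)
    # Give each distinct symbol a sub-range of [0, 1) proportional to its count.
    ranges = {}
    cumulative = 0
    for sym in sorted(counts):
        ranges[sym] = (Fraction(cumulative, total),
                       Fraction(cumulative + counts[sym], total))
        cumulative += counts[sym]
    low, high = Fraction(0), Fraction(1)
    for sym in symbols:
        width = high - low
        sym_low, sym_high = ranges[sym]
        low, high = low + width * sym_low, low + width * sym_high
    return low, high


if __name__ == "__main__":
    codes = lzw_encode("TOBEORNOTTOBEORTOBEORNOT")
    print(codes)                    # 16 codes for 24 input characters
    print(narrow_interval(codes))   # interval identifying the code stream
```

Any number inside the final interval identifies the input sequence under the chosen model; the interval width equals the product of the symbol probabilities, which is why the code length approaches the entropy bound. Feeding the LZW codes into the interval step mirrors the LZW-then-arithmetic ordering the abstract describes, but only at a conceptual level.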
Source journal
CiteScore: 3.90
Self-citation rate: 13.60%
Articles published: 23
About the journal: As a result of recent advances in MEMS/NEMS and systems biology, as well as the emergence of synthetic bacteria and lab/process-on-a-chip techniques, it is now possible to design chemical “circuits”, custom organisms, micro/nanoscale swarms of devices, and a host of other new systems. This success opens up a new frontier for interdisciplinary communications techniques using chemistry, biology, and other principles that have not been considered in the communications literature. The IEEE Transactions on Molecular, Biological, and Multi-Scale Communications (T-MBMSC) is devoted to the principles, design, and analysis of communication systems that use physics beyond classical electromagnetism. This includes molecular, quantum, and other physical, chemical and biological techniques, as well as new communication techniques at small scales or across multiple scales (e.g., nano to micro to macro; note that strictly nanoscale systems, 1-100 nm, are outside the scope of this journal). Original research articles on one or more of the following topics are within scope: mathematical modeling, information/communication and network theoretic analysis, standardization and industrial applications, and analytical or experimental studies on communication processes or networks in biology. Contributions on related topics may also be considered for publication. Contributions from researchers outside the IEEE’s typical audience are encouraged.