Yesenia Cevallos, Luis Tello-Oquendo, Deysi Inca, Nicolay Samaniego, Ivone Santillán, A. Shirazi, Guillermo A. Gomez
Proceedings of the 7th ACM International Conference on Nanoscale Computing and Communication (NanoCom 2020), Virtual Conference, September 23-25, 2020
Published: 2020-09-23
DOI: 10.1145/3411295.3411314 (https://doi.org/10.1145/3411295.3411314)
On the efficient digital code representation in DNA-based data storage
Deoxyribonucleic acid (DNA), the molecule of life, is composed of four nucleotides: adenine, guanine, cytosine, and thymine. Combinations of these nucleotides encode the 20 amino acids that build the structure of living organisms. These discrete components, together with the characteristics and functions of DNA, allow DNA to be understood as a digital component. Viewed as an organic digital memory, DNA becomes a compelling data storage medium: compared with conventional electronic media, it offers superior density, stability, energy efficiency, and longevity, with no foreseeable technical obsolescence. Several challenging experiments have demonstrated that digital information can be written to DNA, stored, and accurately read back. Moreover, given the digital nature of DNA, there is a trend to associate DNA information (6 bits per amino acid, that is, one codon of three nucleotides at 2 bits each) with typical digital codes for information representation (8 bits). We therefore propose using a series of 48 bits to encode the digital information of a host into a DNA representation. This representation is appropriate for end-to-end digital communication systems because (i) it introduces a digital code that is independent of the computer's architecture, and (ii) it can serve as a "common format" for "bio host-bio transmitter" communication, combining the advantages of DNA as a storage medium with effective methods for compressing DNA information to save bandwidth on the transmission medium.
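
As a rough illustration of the arithmetic behind the 48-bit series: 48 is a multiple of both 6 and 8, so one block holds exactly six 8-bit bytes or eight 6-bit codons (24 nucleotides). The Python sketch below packs a 6-byte block into eight codons and back. The 2-bit nucleotide assignment (A=00, C=01, G=10, T=11) and the names `encode_block` and `decode_block` are assumptions made for this example; the abstract does not state the mapping the authors actually use.

```python
from typing import List

# Hypothetical 2-bit-per-nucleotide mapping (an assumption, not from the paper).
BITS_TO_NT = {"00": "A", "01": "C", "10": "G", "11": "T"}
NT_TO_BITS = {nt: bits for bits, nt in BITS_TO_NT.items()}

BLOCK_BYTES = 6   # 6 bytes * 8 bits = 48 bits
CODON_LEN = 3     # 3 nucleotides * 2 bits = 6 bits per codon


def encode_block(block: bytes) -> List[str]:
    """Turn one 48-bit block (6 bytes) into 8 codons of 3 nucleotides each."""
    if len(block) != BLOCK_BYTES:
        raise ValueError("a block must be exactly 6 bytes (48 bits)")
    # Write every byte as 8 binary digits, most significant bit first.
    bits = "".join(f"{byte:08b}" for byte in block)
    # Map each pair of bits to one nucleotide (24 nucleotides in total).
    nts = "".join(BITS_TO_NT[bits[i:i + 2]] for i in range(0, len(bits), 2))
    # Group the nucleotides into codons.
    return [nts[i:i + CODON_LEN] for i in range(0, len(nts), CODON_LEN)]


def decode_block(codons: List[str]) -> bytes:
    """Invert encode_block: 8 codons back to the original 6 bytes."""
    bits = "".join(NT_TO_BITS[nt] for codon in codons for nt in codon)
    if len(bits) != BLOCK_BYTES * 8:
        raise ValueError("expected 8 codons (48 bits)")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    codons = encode_block(b"DNA!42")
    print(codons)
    print(decode_block(codons))
```

Because the block length is fixed, the mapping never needs padding: every 6-byte block produces a whole number of codons, which is one plausible reason for choosing a multiple of 24 bits.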