A Compression Algorithm for DNA Sequences Based on R2G Techniques with Security
S. M. Hossein, P. Mohapatra, D. De
Trends in Bioinformatics, published 2015-03-01
DOI: https://doi.org/10.3923/TB.2015.93.98
Citations: 1
Abstract
A lossless compression algorithm for genetic sequences is reported, based on searching for exact repeats, reverses and genetic palindromes. The compression results show that exact repeats, reverses and genetic palindromes are among the main hidden regularities in DNA sequences. The algorithm replaces each repeat, reverse or genetic palindrome substring with a single ASCII character and builds an online library file that acts as a Look Up Table (LUT). Repeats use character codes 33 to 33+72 (33 to 105), reverses use 33+73 to 33+73+72 (106 to 178), and genetic palindromes use 179 to 179+72 (179 to 251). The ASCII coding, together with the online library file acting as a signature, can also provide data security. The algorithm can approach a compression rate of 3.851273 bit/base.
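The three code ranges in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the library layout (a list in which entry i holds the substring for slot i) and the reading of "genetic palindrome" as the reverse complement are all assumptions made for illustration; only the numeric ranges come from the abstract.

```python
# Code ranges as given in the abstract; each range holds 73 codes (offsets 0..72).
SLOTS = 73
REPEAT_BASE = 33            # repeats:             33 .. 105
REVERSE_BASE = 33 + 73      # reverses:           106 .. 178
PALINDROME_BASE = 179       # genetic palindromes: 179 .. 251

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse(seq: str) -> str:
    """Return the sequence read backwards."""
    return seq[::-1]


def genetic_palindrome(seq: str) -> str:
    """Assumption: a genetic palindrome is the reverse complement."""
    return seq.translate(COMPLEMENT)[::-1]


def code_for(kind: str, slot: int) -> int:
    """Map a match kind and a library slot (hypothetical layout) to a code."""
    bases = {
        "repeat": REPEAT_BASE,
        "reverse": REVERSE_BASE,
        "palindrome": PALINDROME_BASE,
    }
    if not 0 <= slot < SLOTS:
        raise ValueError("slot must be in 0..72")
    return bases[kind] + slot


def decode(code: int, library: list[str]) -> str:
    """Expand one code back into bases using the library as a look-up table."""
    rules = (
        (REPEAT_BASE, lambda s: s),
        (REVERSE_BASE, reverse),
        (PALINDROME_BASE, genetic_palindrome),
    )
    for base, transform in rules:
        if base <= code < base + SLOTS:
            return transform(library[code - base])
    raise ValueError("code is outside the three ranges")
```

For example, with a library of `["AACG"]`, code 33 expands to the stored substring itself, code 106 to its reverse, and code 179 to its reverse complement under the assumption above. Without the library file the codes cannot be expanded, which is the sense in which the abstract describes the library as a signature.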