Security Provisioning and Compression of Diverse Genomic Data based on Advanced Encryption Standard (AES) Algorithm

Q4 Biochemistry, Genetics and Molecular Biology
Raveendra Gudodagi, R. Reddy
DOI: 10.46300/91011.2021.15.14 (https://doi.org/10.46300/91011.2021.15.14)
Published: 2021-05-14, International Journal of Biology and Biomedical Engineering
Citations: 0

Abstract

Compression of genomic data has gained considerable momentum in recent years, driven by advances in technology, rapidly growing health concerns, and government funding for research. These advances are pushing public health and medical care toward personalization, which in turn poses a considerable data-storage challenge for ubiquitous computing. One of the main issues faced by genomic laboratories is the cost of storage, since the data files for a human genome range from 30 GB to 200 GB. Data preservation is a set of actions meant to protect data from unauthorized access or changes, and encryption is one of several methods used for this. Protecting genomic data is a critical concern in genomics because it includes personal data. In this article we propose a secure encryption and decryption technique for diverse genomic data in FASTA and FASTQ formats. Because sequenced data is massive, the raw sequence file is split into sections and compressed. The Advanced Encryption Standard (AES) is used for encryption, and Galois/Counter Mode (GCM) is used to decrypt the encrypted data. This approach reduces the disk space needed for the data while preserving the data. It calls for a modern compression strategy that not only reduces storage but also improves processing efficiency by using a k-th order Markov chain. To date, no efforts have been made to address this problem separately from both the hardware and the software side. In this analysis, we support the need for a tailor-made hardware and software ecosystem that takes full advantage of current stand-alone solutions. The paper deals with sequenced DNA, which may take the form of raw data obtained from sequencing. Inappropriate use of genomic data presents unique risks because it can be used to classify any individual; the study therefore focuses on security provisioning and compression of diverse genomic data using the AES algorithm.
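The abstract mentions a k-th order Markov chain as part of the compression strategy but gives no details. As a rough illustration only, and not the authors' implementation, the sketch below estimates how many bits per base an ideal entropy coder would need when each base is predicted from the k bases before it. The function name `markov_bits_per_base`, the add-one smoothing, and the restriction to the four-letter alphabet A, C, G, T are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def markov_bits_per_base(seq: str, k: int) -> float:
    """Average code length (bits per symbol) of `seq` under an adaptive
    order-k Markov model with add-one smoothing.

    Assumption: `seq` contains only the symbols A, C, G and T.
    """
    if not seq:
        return 0.0
    alphabet_size = 4  # A, C, G, T
    # Maps a context (the preceding k symbols) to counts of what followed it.
    counts = defaultdict(Counter)
    total_bits = 0.0
    for i, sym in enumerate(seq):
        ctx = seq[max(0, i - k):i]  # shorter than k at the start of the sequence
        seen = counts[ctx]
        # Smoothed probability of `sym` given its context, from data seen so far.
        p = (seen[sym] + 1) / (sum(seen.values()) + alphabet_size)
        total_bits += -math.log2(p)
        seen[sym] += 1  # update the model only after "coding" the symbol
    return total_bits / len(seq)


if __name__ == "__main__":
    repetitive = "ACGT" * 50
    for order in (0, 2):
        print(order, round(markov_bits_per_base(repetitive, order), 3))
```

On a highly repetitive sequence such as the one in the usage lines, the order-2 estimate falls well below the order-0 estimate, which is the intuition behind using longer contexts for genomic data. A real compressor would pair such a model with an arithmetic coder and would need to handle symbols such as N and FASTQ quality scores.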
Source journal
International Journal of Biology and Biomedical Engineering
Subject area: Biochemistry, Genetics and Molecular Biology (all)
Self-citation rate: 0.00%
Articles published: 42
Journal description: Topics: Molecular Dynamics, Biochemistry, Biophysics, Quantum Chemistry, Molecular Biology, Cell Biology, Immunology, Neurophysiology, Genetics, Population Dynamics, Dynamics of Diseases, Bioecology, Epidemiology, Social Dynamics, Photobiology, Photochemistry, Plant Biology, Microbiology, Bioinformatics, Signal Transduction, Environmental Systems, Psychological and Cognitive Systems, Pattern Formation, Evolution, Game Theory and Adaptive Dynamics, Bioengineering, Biotechnologies, Medical Imaging, Medical Signal Processing, Feedback Control in Biology and Chemistry, Fluid Mechanics and Applications in Biomedicine, Space Medicine and Biology, Nuclear Biology and Medicine.