In silico framework for genome analysis
M. Saqib Nawaz, M. Zohaib Nawaz, Yongshun Gong, Philippe Fournier-Viger, Abdoulaye Baniré Diallo
Future Generation Computer Systems-The International Journal of Escience, volume 164, Article 107585. Published 2024-11-12.
DOI: 10.1016/j.future.2024.107585
https://www.sciencedirect.com/science/article/pii/S0167739X24005491
Abstract
Genomes hold the complete genetic information of an organism. Examining and analyzing genomic data plays a critical role in understanding an organism, particularly the main characteristics, functions, and evolving nature of harmful viruses. However, the rapid growth of genomic data makes it increasingly challenging to extract meaningful insights from large and complex genomic datasets. In this paper, a novel Framework for Genome Data Analysis (F4GDA) is developed that offers various methods for analyzing viral genomic data in various forms. The framework’s methods can analyze not only changes in genomes but also various genome contents. As a case study, the genomes of five SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) variants of concern (VoC), divided into three groups on the basis of geographical location, are analyzed using this framework to (1) investigate the nucleotide, amino acid, and synonymous codon changes in the whole genomes of the VoC as well as in the Spike (S) protein, (2) examine whether different environments affect the rate of change in genomes, (3) characterize the variations in nucleotide base, amino acid, and codon base compositions in the VoC genomes, and (4) compare the VoC genomes with the reference genome sequence of SARS-CoV-2.
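To make the kinds of analysis named in the abstract concrete, the following is a minimal Python sketch of two of them: nucleotide base composition, and codon-by-codon comparison of a variant coding sequence against a reference that labels each change as synonymous or nonsynonymous. It is not the F4GDA implementation, which the abstract does not describe. The function names `base_composition` and `codon_changes` are hypothetical, and the sketch assumes the two sequences are already aligned, ungapped, of equal length, in frame, and contain only A, C, G, and T.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def base_composition(seq):
    """Return the fraction of each nucleotide (A, C, G, T) in seq.

    Hypothetical helper: characters other than A, C, G, T are ignored.
    """
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return {b: counts[b] / total for b in "ACGT"}


def codon_changes(ref, var):
    """Compare two aligned coding sequences codon by codon.

    Hypothetical helper. Assumes equal length, a length divisible by 3,
    and only A/C/G/T characters (no gaps or ambiguity codes).
    Returns a list of (codon_number, ref_codon, var_codon, kind), where
    codon_number is 1-based and kind is "synonymous" or "nonsynonymous".
    """
    if len(ref) != len(var) or len(ref) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    changes = []
    for i in range(0, len(ref), 3):
        r, v = ref[i:i + 3].upper(), var[i:i + 3].upper()
        if r != v:
            same_aa = CODON_TABLE[r] == CODON_TABLE[v]
            kind = "synonymous" if same_aa else "nonsynonymous"
            changes.append((i // 3 + 1, r, v, kind))
    return changes


if __name__ == "__main__":
    reference = "ATGGATCTT"  # toy sequence, not real SARS-CoV-2 data
    variant = "ATGGGTCTC"
    print(base_composition(variant))
    for change in codon_changes(reference, variant):
        print(change)
```

In the toy input, the second codon changes the encoded amino acid (GAT to GGT, aspartate to glycine), while the third does not (CTT and CTC both encode leucine). Real genome comparisons would first need alignment and handling of gaps and ambiguous bases, which this sketch leaves out.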
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.