Selected String Representation for Whole Genomes
Xiaomeng Wu, Guohui Lin
2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology
DOI: 10.1109/CIBCB.2005.1594905 (https://doi.org/10.1109/CIBCB.2005.1594905)
Citation count: 1
Abstract
The growing volume of available genomic data has made phylogenetic analysis possible at the whole-genome scale. However, data at this scale imposes computational challenges in both memory consumption and CPU usage. One novel proposal in this paper is to extract sequence patterns that are biologically meaningful. Using these patterns, whole genomes can be mapped into a significantly lower-dimensional space, so that subsequent studies using these representations become computationally feasible. Experiments on two datasets, one of 64 vertebrate mitochondrial genomes and one of 99 prokaryote whole genomes, demonstrate that the selected sequence patterns yield good-quality evolutionary distances, as judged by the final phylogeny.
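The general idea of mapping genomes onto a small set of patterns can be sketched as follows. This is a minimal illustration only and not the authors' method: the abstract does not say how patterns are selected or how distances are computed, so the pattern list, the per-position frequency normalization, the Euclidean distance, and all function names here are assumptions made for the example.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def count_overlapping(sequence: str, pattern: str) -> int:
    """Count occurrences of pattern in sequence, allowing overlaps."""
    count = 0
    start = sequence.find(pattern)
    while start != -1:
        count += 1
        start = sequence.find(pattern, start + 1)
    return count


def pattern_vector(genome: str, patterns: Sequence[str]) -> List[float]:
    """Map a genome to one frequency per pattern (assumed normalization:
    occurrences divided by genome length)."""
    length = max(len(genome), 1)
    return [count_overlapping(genome, p) / length for p in patterns]


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Assumed distance; the paper may use a different measure."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(
    genomes: Dict[str, str], patterns: Sequence[str]
) -> Tuple[List[str], List[List[float]]]:
    """Return sorted genome names and their pairwise distance matrix."""
    names = sorted(genomes)
    vectors = {name: pattern_vector(genomes[name], patterns) for name in names}
    matrix = [
        [euclidean_distance(vectors[a], vectors[b]) for b in names]
        for a in names
    ]
    return names, matrix


if __name__ == "__main__":
    # Toy sequences and hypothetical patterns, for illustration only.
    toy_genomes = {"g1": "ACGACGTT", "g2": "ACGTTTTT", "g3": "GGGGCCCC"}
    toy_patterns = ["ACG", "TT", "GC"]
    names, matrix = distance_matrix(toy_genomes, toy_patterns)
    for name, row in zip(names, matrix):
        print(name, [round(d, 3) for d in row])
```

Each genome is reduced to a vector with one entry per pattern, so memory per genome depends on the number of patterns and not on genome length. The resulting matrix could then be passed to a distance-based tree-building method such as neighbor joining.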