Fabián Reyes-Prieto, Adda J García-Chéquer, Hueman Jaimes-Díaz, Janet Casique-Almazán, Juana M Espinosa-Lara, Rosaura Palma-Orozco, Alfonso Méndez-Tenorio, Rogelio Maldonado-Rodríguez, Kenneth L Beattie
{"title":"LifePrint: a novel k-tuple distance method for construction of phylogenetic trees.","authors":"Fabián Reyes-Prieto, Adda J García-Chéquer, Hueman Jaimes-Díaz, Janet Casique-Almazán, Juana M Espinosa-Lara, Rosaura Palma-Orozco, Alfonso Méndez-Tenorio, Rogelio Maldonado-Rodríguez, Kenneth L Beattie","doi":"10.2147/AABC.S15021","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>Here we describe LifePrint, a sequence alignment-independent k-tuple distance method to estimate relatedness between complete genomes.</p><p><strong>Methods: </strong>We designed a representative sample of all possible DNA tuples of length 9 (9-tuples). The final sample comprises 1878 tuples (called the LifePrint set of 9-tuples; LPS9) that are distinct from each other by at least two internal and noncontiguous nucleotide differences. For validation of our k-tuple distance method, we analyzed several real and simulated viroid genomes. Using different distance metrics, we scrutinized diverse viroid genomes to estimate the k-tuple distances between these genomic sequences. Then we used the estimated genomic k-tuple distances to construct phylogenetic trees using the neighbor-joining algorithm. A comparison of the accuracy of LPS9 and the previously reported 5-tuple method was made using symmetric differences between the trees estimated from each method and a simulated \"true\" phylogenetic tree.</p><p><strong>Results: </strong>The identified optimal search scheme for LPS9 allows only up to two nucleotide differences between each 9-tuple and the scrutinized genome. Similarity search results of simulated viroid genomes indicate that, in most cases, LPS9 is able to detect single-base substitutions between genomes efficiently. Analysis of simulated genomic variants with a high proportion of base substitutions indicates that LPS9 is able to discern relationships between genomic variants with up to 40% of nucleotide substitution.</p><p><strong>Conclusion: </strong>Our LPS9 method generates more accurate phylogenetic reconstructions than the previously proposed 5-tuples strategy. LPS9-reconstructed trees show higher bootstrap proportion values than distance trees derived from the 5-tuple method.</p>","PeriodicalId":53584,"journal":{"name":"Advances and Applications in Bioinformatics and Chemistry","volume":"4 ","pages":"13-27"},"PeriodicalIF":0.0000,"publicationDate":"2011-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.2147/AABC.S15021","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advances and Applications in Bioinformatics and Chemistry","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2147/AABC.S15021","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2011/1/20 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"Biochemistry, Genetics and Molecular Biology","Score":null,"Total":0}
引用次数: 8
Abstract
Purpose: Here we describe LifePrint, a sequence alignment-independent k-tuple distance method to estimate relatedness between complete genomes.
Methods: We designed a representative sample of all possible DNA tuples of length 9 (9-tuples). The final sample comprises 1878 tuples (called the LifePrint set of 9-tuples; LPS9) that are distinct from each other by at least two internal and noncontiguous nucleotide differences. For validation of our k-tuple distance method, we analyzed several real and simulated viroid genomes. Using different distance metrics, we scrutinized diverse viroid genomes to estimate the k-tuple distances between these genomic sequences. Then we used the estimated genomic k-tuple distances to construct phylogenetic trees using the neighbor-joining algorithm. A comparison of the accuracy of LPS9 and the previously reported 5-tuple method was made using symmetric differences between the trees estimated from each method and a simulated "true" phylogenetic tree.
Results: The identified optimal search scheme for LPS9 allows only up to two nucleotide differences between each 9-tuple and the scrutinized genome. Similarity search results of simulated viroid genomes indicate that, in most cases, LPS9 is able to detect single-base substitutions between genomes efficiently. Analysis of simulated genomic variants with a high proportion of base substitutions indicates that LPS9 is able to discern relationships between genomic variants with up to 40% of nucleotide substitution.
Conclusion: Our LPS9 method generates more accurate phylogenetic reconstructions than the previously proposed 5-tuples strategy. LPS9-reconstructed trees show higher bootstrap proportion values than distance trees derived from the 5-tuple method.