{"title":"Gene Tree Affects Inference of Sites Under Selection by the Branch-Site Test of Positive Selection","authors":"Y. Diekmann, J. Pereira-Leal","doi":"10.4137/EBO.S30902","DOIUrl":"https://doi.org/10.4137/EBO.S30902","url":null,"abstract":"The branch-site test of positive selection is a standard approach to detect past episodic positive selection in a priori-specified branches of a gene phylogeny. Here, we ask if differences in the topology of the gene tree have any influence on the ability to infer positively selected sites. Using simulated sequences, we compare the results obtained for true and rearranged topologies. We find a strong relationship between “conflicting branch length,” which occurs when the set of sequences that experiences selection for a given topology and foreground is changed, and the ability to predict positively selected sites. Moreover, by reanalyzing a previously published data set, we show that the choice of a gene tree also affects the results obtained for real-world sequences. This is the first study to demonstrate that tree topology has a clear effect on the inference of positive selection. We conclude that the choice of a gene tree is an important factor for the branch-site analysis of positive selection.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130483105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Phylotranscriptomic Analysis Based on Coalescence was Less Influenced by the Evolving Rates and the Number of Genes: A Case Study in Ericales","authors":"Luchong Zhang, Wei Wu, Haifei Yan, X. Ge","doi":"10.4137/EBO.S22448","DOIUrl":"https://doi.org/10.4137/EBO.S22448","url":null,"abstract":"Advances in high-throughput sequencing have generated a vast amount of transcriptomic data that are being increasingly used in phylogenetic reconstruction. However, processing the vast datasets for a huge number of genes and even identifying optimal analytical methodology are challenging. Through de novo sequenced and retrieved data from public databases, we identified 221 orthologous protein-coding genes to reconstruct the phylogeny of Ericales, an order characterized by rapid ancient radiation. Seven species representing different families in Ericales were used as in-groups. Both concatenation and coalescence methods yielded the same well-supported topology as previous studies, with only two nodes conflicting with previously reported relationships. The results revealed that a partitioning strategy could improve the traditional concatenation methodology. Rapidly evolving genes negatively affected the concatenation analysis, while slowly evolving genes slightly affected the coalescence analysis. The coalescence methods usually accommodated rate heterogeneity better and required fewer genes to yield well-supported topologies than the concatenation methods with both real and simulated data.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132557257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparative Genomics of Amphibian-like Ranaviruses, Nucleocytoplasmic Large DNA Viruses of Poikilotherms","authors":"S. J. Price","doi":"10.4137/EBO.S33490","DOIUrl":"https://doi.org/10.4137/EBO.S33490","url":null,"abstract":"Recent research on genome evolution of large DNA viruses has highlighted a number of incredibly dynamic processes that can facilitate rapid adaptation. The genomes of amphibian-like ranaviruses - double-stranded DNA viruses infecting amphibians, reptiles, and fish (family Iridoviridae) - were examined to assess variation in genome content and evolutionary processes. The viruses studied were closely related, but their genome content varied considerably, with 29 genes identified that were not present in all of the major clades. Twenty-one genes had evidence of recombination, while a virus isolated from a captive reptile appeared to be a mosaic of two divergent parents. Positive selection was also found to be acting on more than a quarter of Ranavirus genes and was found most frequently in the Spanish common midwife toad virus, which has had a severe impact on amphibian host communities. Efforts to resolve the root of this group by inclusion of an outgroup were inconclusive, but a set of core genes were identified, which recovered a well-supported species tree.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129515023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forward-in-Time, Spatially Explicit Modeling Software to Simulate Genetic Lineages Under Selection","authors":"M. Currat, P. Gerbault, D. Di, J. M. Nunes, A. Sanchez‐Mazas","doi":"10.4137/EBO.S33488","DOIUrl":"https://doi.org/10.4137/EBO.S33488","url":null,"abstract":"SELECTOR is a software package for studying the evolution of multiallelic genes under balancing or positive selection while simulating complex evolutionary scenarios that integrate demographic growth and migration in a spatially explicit population framework. Parameters can be varied both in space and time to account for geographical, environmental, and cultural heterogeneity. SELECTOR can be used within an approximate Bayesian computation estimation framework. We first describe the principles of SELECTOR and validate the algorithms by comparing its outputs for simple models with theoretical expectations. Then, we show how it can be used to investigate genetic differentiation of loci under balancing selection in interconnected demes with spatially heterogeneous gene flow. We identify situations in which balancing selection reduces genetic differentiation between population groups compared with neutrality and explain conflicting outcomes observed for human leukocyte antigen loci. These results and three previously published applications demonstrate that SELECTOR is efficient and robust for building insight into human settlement history and evolution.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124457580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Disease-Associated Coding Sequence Variation to Investigate Functional Compensation by Human Paralogous Proteins","authors":"Sayaka Miura, Stephanie Tate, Sudhir Kumar","doi":"10.4137/EBO.S30594","DOIUrl":"https://doi.org/10.4137/EBO.S30594","url":null,"abstract":"Gene duplication enables the functional diversification in species. It is thought that duplicated genes may be able to compensate if the function of one of the gene copies is disrupted. This possibility is extensively debated with some studies reporting proteome-wide compensation, whereas others suggest functional compensation among only recent gene duplicates or no compensation at all. We report results from a systematic molecular evolutionary analysis to test the predictions of the functional compensation hypothesis. We contrasted the density of Mendelian disease-associated single nucleotide variants (dSNVs) in proteins with no discernable paralogs (singletons) with the dSNV density in proteins found in multigene families. Under the functional compensation hypothesis, we expected to find greater numbers of dSNVs in singletons due to the lack of any compensating partners. Our analyses produced an opposite pattern; paralogs have over 35% higher dSNV density than singletons. We found that these patterns are concordant with similar differences in the rates of amino acid evolution (ie, functional constraints), as the proteins with paralogs have evolved 33% slower than singletons. Our evolutionary constraint explanation is robust to differences in family sizes, ages (young vs. old duplicates), and degrees of amino acid sequence similarities among paralogs. Therefore, disease-associated human variation does not exhibit significant signals of functional compensation among paralogous proteins, but rather an evolutionary constraint hypothesis provides a better explanation for the observed patterns of disease-associated and neutral polymorphisms in the human genome.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123832305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using uniformat and gene[rate] to Analyze Data with Ambiguities in Population Genetics","authors":"J. M. Nunes","doi":"10.4137/EBO.S32415","DOIUrl":"https://doi.org/10.4137/EBO.S32415","url":null,"abstract":"Some genetic systems frequently present ambiguous data that cannot be straightforwardly analyzed with common methods of population genetics. Two possibilities arise to analyze such data: one is the arbitrary simplification of the data and the other is the development of methods adapted to such ambiguous data. In this article, we present an attempt at such a development, the UNIFORMAT grammar and the GENE[RATE] tools, highlighting the specific aspects and the adaptations required to analyze ambiguous nominal data in population genetics.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125673889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovering All Transcriptome Single-Nucleotide Polymorphisms and Scanning for Selection Signatures in Ducks (Anas platyrhynchos)","authors":"R. Lin, Xiaoyong Du, Sixue Peng, Liubin Yang, Yunlong Ma, Y. Gong, Shijun Li","doi":"10.4137/EBO.S21545","DOIUrl":"https://doi.org/10.4137/EBO.S21545","url":null,"abstract":"The duck is one of the most economically important waterfowl as a source of meat, eggs, and feathers. Characterizing the genetic variation in duck species is an important step toward linking genes or genomic regions with phenotypes. Human-driven selection during duck domestication and subsequent breed formation has likely left detectable signatures in duck genome. In this study, we employed a panel of >1.4 million single-nucleotide polymorphisms (SNPs) identified from the RNA sequencing (RNA-seq) data of 15 duck individuals. The density of the resulting SNPs is significantly positively correlated with the density of genes across the duck genome, which demonstrates that the usage of the RNA-seq data allowed us to enrich variant functional categories, such as coding exons, untranslated regions (UTRs), introns, and downstream/upstream. We performed a complete scan of selection signatures in the ducks using the composite likelihood ratio (CLR) and found 76 candidate regions of selection, many of which harbor genes related to phenotypes relevant to the function of the digestive system and fat metabolism, including TCF7L2, EIF2AK3, ELOVL2, and fatty acid-binding protein family. This study illustrates the potential of population genetic approaches for identifying genomic regions affecting domestication-related phenotypes and further helps to increase the known genetic information about this economically important animal.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115150925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In Silico Gene Regulatory Network of the Maurer’s Cleft Pathway in Plasmodium falciparum","authors":"Itunuoluwa Isewon, Jelili Oyelade, B. Brors, E. Adebiyi","doi":"10.4137/EBO.S25585","DOIUrl":"https://doi.org/10.4137/EBO.S25585","url":null,"abstract":"The Maurer's clefts (MCs) are very important for the survival of Plasmodium falciparum within an infected cell as they are induced by the parasite itself in the erythrocyte for protein trafficking. The MCs form an interesting part of the parasite's biology as they shed more light on how the parasite remodels the erythrocyte leading to host pathogenesis and death. Here, we predicted and analyzed the genetic regulatory network of genes identified to belong to the MCs using regularized graphical Gaussian model. Our network shows four major activators, their corresponding target genes, and predicted binding sites. One of these master activators is the serine repeat antigen 5 (SERA5), predominantly expressed among the SERA multigene family of P. falciparum, which is one of the blood-stage malaria vaccine candidates. Our results provide more details about functional interactions and the regulation of the genes in the MCs’ pathway of P. falciparum.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123809222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Segmenting the Human Genome into Isochores","authors":"P. Cozzi, L. Milanesi, G. Bernardi","doi":"10.4137/EBO.S27693","DOIUrl":"https://doi.org/10.4137/EBO.S27693","url":null,"abstract":"The human genome is a mosaic of isochores, which are long (>200 kb) DNA sequences that are fairly homogeneous in base composition and can be assigned to five families comprising 33%–59% of GC composition. Although the compartmentalized organization of the mammalian genome has been investigated for more than 40 years, no satisfactory automatic procedure for segmenting the genome into isochores is available so far. We present a critical discussion of the currently available methods and a new approach called isoSegmenter which allows segmenting the genome into isochores in a fast and completely automatic manner. This approach relies on two types of experimentally defined parameters, the compositional boundaries of isochore families and an optimal window size of 100 kb. The approach represents an improvement over the existing methods, is ideally suited for investigating long-range features of sequenced and assembled genomes, and is publicly available at https://github.com/bunop/isoSegmenter.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116332053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Screening of Transcription Factors Involved in Fetal Hemoglobin Regulation Using Phylogenetic Footprinting","authors":"Gisele Cristine de Souza Carrocini, L. Venâncio, C. Bonini-Domingos","doi":"10.4137/EBO.S15364","DOIUrl":"https://doi.org/10.4137/EBO.S15364","url":null,"abstract":"Fetal hemoglobin (Hb F) is an important genetic modulator of the beta-hemoglobinopathies. The regulation of Hb F levels is influenced by transcription factors. We used phylogenetic footprinting to screen transcription factors that have binding sites in HBG1 and HBG2 genes’ noncoding regions in order to know the genetic determinants of the Hb F expression. Our analysis showed 354 conserved motifs in the noncoding regions of HBG1 gene and 231 motifs in the HBG2 gene between the analyzed species. Of these motifs, 13 showed relation to Hb F regulation: cell division cycle-5 (CDC5), myeloblastosis viral oncogene homolog (c-MYB), transcription factor CP2 (TFCP2), GATA binding protein 1 (GATA-1), GATA binding protein 2 (GATA-2), nuclear factor erythroid 2 (NF-E2), nuclear transcription factor Y (NF-Y), runt-related transcription factor 1 (RUNX-1), T-cell acute lymphocytic leukemia 1 (TAL-1), YY1 transcription factor (YY1), beta protein 1 (BP1), chicken ovalbumin upstream promoter-transcription factor II (COUP-TFII), and paired box 1 (PAX-1). The last three motifs were conserved only in the noncoding regions of the HBG1 gene. The understanding of genetic elements involved in the maintenance of high Hb F levels may provide new efficient therapeutic strategies in the beta-hemoglobinopathies treatment, promoting reduction in clinical complications of these genetic disorders.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126839260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}