{"title":"Segmenting the Human Genome into Isochores","authors":"P. Cozzi, L. Milanesi, G. Bernardi","doi":"10.4137/EBO.S27693","DOIUrl":"https://doi.org/10.4137/EBO.S27693","url":null,"abstract":"The human genome is a mosaic of isochores, which are long (>200 kb) DNA sequences that are fairly homogeneous in base composition and can be assigned to five families comprising 33%–59% of GC composition. Although the compartmentalized organization of the mammalian genome has been investigated for more than 40 years, no satisfactory automatic procedure for segmenting the genome into isochores is available so far. We present a critical discussion of the currently available methods and a new approach called isoSegmenter which allows segmenting the genome into isochores in a fast and completely automatic manner. This approach relies on two types of experimentally defined parameters, the compositional boundaries of isochore families and an optimal window size of 100 kb. The approach represents an improvement over the existing methods, is ideally suited for investigating long-range features of sequenced and assembled genomes, and is publicly available at https://github.com/bunop/isoSegmenter.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116332053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Screening of Transcription Factors Involved in Fetal Hemoglobin Regulation Using Phylogenetic Footprinting","authors":"Gisele Cristine de Souza Carrocini, L. Venâncio, C. Bonini-Domingos","doi":"10.4137/EBO.S15364","DOIUrl":"https://doi.org/10.4137/EBO.S15364","url":null,"abstract":"Fetal hemoglobin (Hb F) is an important genetic modulator of the beta-hemoglobinopathies. The regulation of Hb F levels is influenced by transcription factors. We used phylogenetic footprinting to screen transcription factors that have binding sites in the noncoding regions of the HBG1 and HBG2 genes in order to identify the genetic determinants of Hb F expression. Our analysis showed 354 conserved motifs in the noncoding regions of the HBG1 gene and 231 motifs in the HBG2 gene between the analyzed species. Of these motifs, 13 showed relation to Hb F regulation: cell division cycle-5 (CDC5), myeloblastosis viral oncogene homolog (c-MYB), transcription factor CP2 (TFCP2), GATA binding protein 1 (GATA-1), GATA binding protein 2 (GATA-2), nuclear factor erythroid 2 (NF-E2), nuclear transcription factor Y (NF-Y), runt-related transcription factor 1 (RUNX-1), T-cell acute lymphocytic leukemia 1 (TAL-1), YY1 transcription factor (YY1), beta protein 1 (BP1), chicken ovalbumin upstream promoter-transcription factor II (COUP-TFII), and paired box 1 (PAX-1). The last three motifs were conserved only in the noncoding regions of the HBG1 gene. Understanding the genetic elements involved in maintaining high Hb F levels may provide new, efficient therapeutic strategies for treating beta-hemoglobinopathies and reduce the clinical complications of these genetic disorders.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126839260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Why do Sequence Signatures Predict Enzyme Mechanism? Homology versus Chemistry","authors":"Kirsten E. Beattie, Luna De Ferrari, John B. O. Mitchell","doi":"10.4137/EBO.S31482","DOIUrl":"https://doi.org/10.4137/EBO.S31482","url":null,"abstract":"First, we identify InterPro sequence signatures representing evolutionary relatedness and, second, signatures identifying specific chemical machinery. Thus, we predict the chemical mechanisms of enzyme-catalyzed reactions from catalytic and non-catalytic subsets of InterPro signatures. We first scanned our 249 sequences using InterProScan and then used the MACiE database to identify those amino acid residues that are important for catalysis. The sequences were mutated in silico to replace these catalytic residues with glycine and then again scanned using InterProScan. Those signature matches from the original scan that disappeared on mutation were called catalytic. Mechanism was predicted using all signatures, only the 78 “catalytic” signatures, or only the 519 “non-catalytic” signatures. The non-catalytic signatures gave indistinguishable results from those for the whole feature set, with precision of 0.991 and sensitivity of 0.970. The catalytic signatures alone gave less impressive predictivity, with precision and sensitivity of 0.791 and 0.735, respectively. These results show that our successful prediction of enzyme mechanism is mostly by homology rather than by identifying catalytic machinery.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131036646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TIRfinder: A Web Tool for Mining Class II Transposons Carrying Terminal Inverted Repeats","authors":"T. Gambin, M. Startek, K. Walczak, Jarosław Paszek, D. Grzebelus, A. Gambin","doi":"10.4137/EBO.S10619","DOIUrl":"https://doi.org/10.4137/EBO.S10619","url":null,"abstract":"Transposable elements (TEs) can be found in virtually all known genomes; plant genomes are exceptionally rich in this kind of dispersed repetitive sequence. Current knowledge of TE proliferation dynamics places them among the main forces of molecular evolution. Efficient tools for analyzing TE distribution in genomes are therefore needed to support comparative genomics studies and the study of TE dynamics within a genome. This was our main motivation for constructing TIRfinder, an efficient tool for mining class II TEs carrying terminal inverted repeats. TIRfinder takes as input a genomic sequence and information on the structural properties of a TE family, and identifies all TEs in the genome showing the desired structural characteristics. The efficiency and small memory requirements of our approach stem from the use of suffix trees to identify all DNA segments surrounded by user-specified terminal inverted repeats (TIR) and target site duplications (TSD), which together constitute a mask. In addition, the flexibility of the notion of the TIR/TSD mask makes it possible to use the tool for de novo detection. The main advantages of TIRfinder are its speed, accuracy and convenience of use for biologists. A web-based interface is freely available at http://bioputer.mimuw.edu.pl/tirfindertool/.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130715145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TreeCmp: Comparison of Trees in Polynomial Time","authors":"D. Bogdanowicz, K. Giaro, B. Wróbel","doi":"10.4137/EBO.S9657","DOIUrl":"https://doi.org/10.4137/EBO.S9657","url":null,"abstract":"When a phylogenetic reconstruction does not result in one tree but in several, tree metrics make it possible to find out how far the reconstructed trees are from one another. They also make it possible to assess the accuracy of a reconstruction when the true tree is known. TreeCmp implements eight metrics that can be calculated in polynomial time for arbitrary (not only bifurcating) trees: four for unrooted trees (Matching Split metric, which we have recently proposed, Robinson-Foulds, Path Difference, Quartet) and four for rooted trees (Matching Cluster, Robinson-Foulds cluster, Nodal Splitted and Triple). TreeCmp is the first implementation of the Matching Split/Cluster metrics and the first efficient and convenient implementation of Nodal Splitted. It allows relatively large trees to be compared. We provide an example of the application of TreeCmp to compare the accuracy of ten approaches to phylogenetic reconstruction with trees of up to 5000 external nodes, using a measure of accuracy based on normalized similarity between trees.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130886759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PhyloMarker—A Tool for Mining Phylogenetic Markers Through Genome Comparison: Application of the Mouse Lemur (Genus Microcebus) Phylogeny","authors":"R. Lei, Thaine W. Rowley, Lifeng Zhu, Carolyn A. Bailey, Shannon E. Engberg, M. L. Wood, M. Christman, G. Perry, E. Louis, G. Lu","doi":"10.4137/EBO.S9886","DOIUrl":"https://doi.org/10.4137/EBO.S9886","url":null,"abstract":"Molecular phylogeny is a fundamental tool to understanding the evolution of all life forms. One common issue faced by molecular phylogeny is the lack of sufficient molecular markers. Here, we present PhyloMarker, a phylogenomic tool designed to find nuclear gene markers for the inference of phylogeny through multiple genome comparison. Around 800 candidate markers were identified by PhyloMarker through comparison of partial genomes of Microcebus and Otolemur. In experimental tests of 20 randomly selected markers, nine markers were successfully amplified by PCR and directly sequenced in all 17 nominal Microcebus species. Phylogenetic analyses of the sequence data obtained for 17 taxa and nine markers confirmed the distinct lineage inferred from previous mtDNA data. PhyloMarker has also been used by other projects including the herons (Ardeidae, Aves) phylogeny and the Wood mice (Muridae, Mammalia) phylogeny. All source code and sample data are made available at http://bioinfo-srv1.awh.unomaha.edu/phylomarker/.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126045486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"easyPAC: A Tool for Fast Prediction, Testing and Reference Mapping of Degenerate PCR Primers from Alignments or Consensus Sequences","authors":"David Rosenkranz","doi":"10.4137/EBO.S8870","DOIUrl":"https://doi.org/10.4137/EBO.S8870","url":null,"abstract":"The PCR amplification of unknown homologous or paralogous genes generally relies on PCR primers predicted from multiple sequence alignments. However, increasing sequence divergence can make degenerate primers necessary, which raises the problem of testing the characteristics, unwanted interactions and potential mispriming of those primers. Here I introduce easyPAC, a new software tool for the prediction of degenerate primers from multiple sequence alignments or single consensus sequences. As a major innovation, easyPAC allows all customary primer test procedures to be applied to degenerate primer sequences, including fast mapping to reference files. Thus, easyPAC greatly simplifies and expedites the design of specific degenerate primers. Degenerate primers suggested by easyPAC were used in PCR amplification with subsequent de novo sequencing of TDRD1 exon 11 homologs from several representatives of the haplorrhine primate phylogeny. The results demonstrate the efficient performance of the suggested primers and therefore show that easyPAC can advance upcoming comparative genetic studies.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114685961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MiIP: The Monomer Identification and Isolation Program","authors":"Christopher Bun, William Ziccardi, J. Doering, C. Putonti","doi":"10.4137/EBO.S9248","DOIUrl":"https://doi.org/10.4137/EBO.S9248","url":null,"abstract":"Repetitive elements within genomic DNA are both functionally and evolutionarily informative. Discovering these sequences ab initio is computationally challenging, compounded by the fact that selection on these repeats is often relaxed; thus sequence identity between repetitive elements can vary significantly. Here we present a new application, the Monomer Identification and Isolation Program (MiIP), which provides functionality both to search for a particular repeat and to discover repetitive elements within a larger genomic sequence. To compare MiIP's performance with other repeat detection tools, analysis was conducted for synthetic sequences as well as several α21-II clones and HC21 BAC sequences. The primary benefit of MiIP is that it is a single tool capable of both searching for known monomeric sequences and discovering the occurrence of repeats ab initio, per the user's required sensitivity of the search. Furthermore, the report functionality facilitates subsequent phylogenetic analysis.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133006434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PlutoF—a Web Based Workbench for Ecological and Taxonomic Research, with an Online Implementation for Fungal ITS Sequences","authors":"K. Abarenkov, L. Tedersoo, R. Nilsson, K. Vellak, I. Saar, Vilmar Veldre, É. Parmasto, M. Prous, Anne Aan, Margus Ots, O. Kurina, I. Ostonen, J. Jõgeva, Siim Halapuu, K. Põldmaa, M. Toots, J. Truu, K. Larsson, U. Kõljalg","doi":"10.4137/EBO.S6271","DOIUrl":"https://doi.org/10.4137/EBO.S6271","url":null,"abstract":"DNA sequences accumulating in the International Nucleotide Sequence Databases (INSD) form a rich source of information for taxonomic and ecological meta-analyses. However, these databases include many erroneous entries, and the data itself is poorly annotated with metadata, making it difficult to target and extract entries of interest with any degree of precision. Here we describe the web-based workbench PlutoF, which is designed to bridge the gap between the needs of contemporary research in biology and the existing software resources and databases. Built on a relational database, PlutoF allows remote-access rapid submission, retrieval, and analysis of study, specimen, and sequence data in INSD as well as for private datasets through web-based thin clients. In contrast to INSD, PlutoF supports internationally standardized terminology to allow very specific annotation and linking of interacting specimens and species. The sequence analysis module is optimized for identification and analysis of environmental ITS sequences of fungi, but it can be modified to operate on any genetic marker and group of organisms. The workbench is available at http://plutof.ut.ee.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125397426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evolution of the Influenza A Virus: Some New Advances","authors":"R. Rabadán, H. Robins","doi":"10.4137/EBO.S328","DOIUrl":"https://doi.org/10.4137/EBO.S328","url":null,"abstract":"Influenza is an RNA virus that causes mild to severe respiratory symptoms in humans and other hosts. Every year approximately half a million people around the world die from seasonal Influenza. But this number is substantially larger in the case of pandemics, with the most dramatic instance being the 1918 “Spanish flu” that killed more than 50 million people worldwide. In the last few years, thousands of Influenza genomic sequences have become publicly available, including the 1918 pandemic strain and many isolates from non-human hosts. Using these data and developing adequate bioinformatic and statistical tools, some of the major questions surrounding Influenza evolution are becoming tractable. Are the mutations and reassortments random? What are the patterns behind the virus’s evolution? What are the necessary and sufficient conditions for a virus adapted to one host to infect a different host? Why is Influenza seasonal? In this review, we summarize some of the recent progress in understanding the evolution of the virus.","PeriodicalId":136690,"journal":{"name":"Evolutionary Bioinformatics Online","volume":"10 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126700382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}