{"title":"Clustering of main orthologs for multiple genomes.","authors":"Zheng Fu, Tao Jiang","doi":"10.1142/9781860948732_0022","DOIUrl":null,"url":null,"abstract":"The identification of orthologous genes shared by multiple genomes is critical for both functional and evolutionary studies in comparative genomics. While it is usually done by sequence similarity search and reconciled tree construction in practice, recently a new combinatorial approach and a high-throughput system MSOAR for ortholog identification between closely related genomes based on genome rearrangement and gene duplication have been proposed in (11). MSOAR assumes that orthologous genes correspond to each other in the most parsimonious evolutionary scenario minimizing the number of genome rearrangement and (post-speciation) gene duplication events. However, the parsimony approach used by MSOAR limits it to pairwsie genome comparisons. In this paper, we extend MSOAR to multiple (closely related) genomes and propose an ortholog clustering method, called MultiMSOAR, to infer main orthologs in multiple genomes. As a preliminary experiment, we apply MultiMSOAR to rat, mouse and human genomes, and validate our results using gene annotations and gene function classifications in the public databases. We further compare our results to the ortholog clusters predicted by MultiParanoid, which is an extension of the well-known program Inparanoid for pairwise genome comparisons. The comparison reveals that MultiMSOAR gives more detailed and accurate orthology information since it can effectively distinguish main orthologs from inparalogs.","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"16 1","pages":"195-201"},"PeriodicalIF":0.0000,"publicationDate":"2007-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/9781860948732_0022","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
The identification of orthologous genes shared by multiple genomes is critical for both functional and evolutionary studies in comparative genomics. While it is usually done by sequence similarity search and reconciled tree construction in practice, recently a new combinatorial approach and a high-throughput system MSOAR for ortholog identification between closely related genomes based on genome rearrangement and gene duplication have been proposed in (11). MSOAR assumes that orthologous genes correspond to each other in the most parsimonious evolutionary scenario minimizing the number of genome rearrangement and (post-speciation) gene duplication events. However, the parsimony approach used by MSOAR limits it to pairwsie genome comparisons. In this paper, we extend MSOAR to multiple (closely related) genomes and propose an ortholog clustering method, called MultiMSOAR, to infer main orthologs in multiple genomes. As a preliminary experiment, we apply MultiMSOAR to rat, mouse and human genomes, and validate our results using gene annotations and gene function classifications in the public databases. We further compare our results to the ortholog clusters predicted by MultiParanoid, which is an extension of the well-known program Inparanoid for pairwise genome comparisons. The comparison reveals that MultiMSOAR gives more detailed and accurate orthology information since it can effectively distinguish main orthologs from inparalogs.