{"title":"Colorful orthology clustering in bounded-degree similarity graphs.","authors":"Alitzel López Sánchez, Manuel Lafond","doi":"10.1142/S0219720021400102","DOIUrl":"https://doi.org/10.1142/S0219720021400102","url":null,"abstract":"<p><p>Clustering genes in similarity graphs is a popular approach for orthology prediction. Most algorithms group genes without considering their species, which results in clusters that contain several paralogous genes. Moreover, clustering is known to be problematic when in-paralogs arise from ancient duplications. Recently, we proposed a two-step process that avoids these problems. First, we infer clusters of only orthologs (i.e. with only genes from distinct species), and second, we infer the missing inter-cluster orthologs. In this paper, we focus on the first step, which leads to a problem we call Colorful Clustering. In general, this is as hard as classical clustering. However, in similarity graphs, the number of species is usually small, as well as the neighborhood size of genes in other species. We therefore study the problem of clustering in which the number of colors is bounded by [Formula: see text], and each gene has at most [Formula: see text] neighbors in another species. We show that the well-known <i>cluster editing</i> formulation remains NP-hard even when [Formula: see text] and [Formula: see text]. We then propose a fixed-parameter algorithm in [Formula: see text] to find the single best cluster in the graph. We implemented this algorithm and included it in the aforementioned two-step approach. Experiments on simulated data show that this approach performs favorably to applying only an unconstrained clustering step.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"19 6","pages":"2140010"},"PeriodicalIF":1.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39889213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Involving repetitive regions in scaffolding improvement.","authors":"Quentin Delorme, Rémy Costa, Yasmine Mansour, Anna-Sophie Fiston-Lavier, Annie Chateau","doi":"10.1142/S0219720021400163","DOIUrl":"https://doi.org/10.1142/S0219720021400163","url":null,"abstract":"<p><p>In this paper, we investigate througth a premilinary study the influence of repeat elements during the assembly process. We analyze the link between the presence and the nature of one type of repeat element, called transposable element (TE) and misassembly events in genome assemblies. We propose to improve assemblies by taking into account the presence of repeat elements, including TEs, during the scaffolding step. We analyze the results and relate the misassemblies to TEs before and after correction.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"19 6","pages":"2140016"},"PeriodicalIF":1.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39614909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BOPAL 2.0 and a study of tRNA and rRNA gene evolution in <i>Clostridium</i>.","authors":"Meghan Chua, Anthony Tan, Olivier Tremblay-Savard","doi":"10.1142/S0219720021400072","DOIUrl":"https://doi.org/10.1142/S0219720021400072","url":null,"abstract":"<p><p>We present BOPAL 2.0, an improved version of the BOPAL algorithm for the evolutionary history inference of tRNA and rRNA genes in bacterial genomes. Our approach can infer complete evolutionary scenarios and ancestral gene orders on a phylogeny and considers a wide range of events such as duplications, deletions, substitutions, inversions and transpositions. It is based on the fact that tRNA and rRNA genes are often organized in operons/clusters in bacteria, and this information is used to help identify orthologous genes for each genome comparison. BOPAL 2.0 introduces new features, such as a triple-wise alignment step, context-aware singleton matching and a second pass of the algorithm. Evaluation on simulated datasets shows that BOPAL 2.0 outperforms the original BOPAL in terms of the accuracy of inferred events and ancestral genomes. We also present a study of the tRNA/rRNA gene evolution in the <i>Clostridium</i> genus, in which the organization of these genes is very divergent. Our results indicate that tRNA and rRNA genes in <i>Clostridium</i> have evolved through numerous duplications, losses, transpositions and substitutions, but very few inversions were inferred.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"19 6","pages":"2140007"},"PeriodicalIF":1.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39889207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Elena S Gusareva, Paolo Alberto Lorenzini, Nurul Adilah Binte Ramli, Amit Gourav Ghosh, Hie Lim Kim
{"title":"Population-specific adaptation in malaria-endemic regions of asia.","authors":"Elena S Gusareva, Paolo Alberto Lorenzini, Nurul Adilah Binte Ramli, Amit Gourav Ghosh, Hie Lim Kim","doi":"10.1142/S0219720021400060","DOIUrl":"https://doi.org/10.1142/S0219720021400060","url":null,"abstract":"<p><p>Evolutionary mechanisms of adaptation to malaria are understudied in Asian endemic regions despite a high prevalence of malaria in the region. In our research, we performed a genome-wide screening for footprints of natural selection against malaria by comparing eight Asian population groups from malaria-endemic regions with two non-endemic population groups from Europe and Mongolia. We identified 285 adaptive genes showing robust selection signals across three statistical methods, iHS, XP-EHH, and PBS. Interestingly, most of the identified genes (82%) were found to be under selection in a single population group, while adaptive genes shared across populations were rare. This is likely due to the independent adaptation history in different endemic populations. The gene ontology (GO) analysis for the 285 adaptive genes highlighted their functional processes linked to neuronal organizations or nervous system development. These genes could be related to cerebral malaria and may reduce the inflammatory response and the severity of malaria symptoms. Remarkably, our novel population genomic approach identified population-specific adaptive genes potentially against malaria infection without the need for patient samples or individual medical records.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"19 6","pages":"2140006"},"PeriodicalIF":1.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39692975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evidence for exon shuffling is sensitive to model choice.","authors":"Xiaoyue Cui, Maureen Stolzer, Dannie Durand","doi":"10.1142/S0219720021400138","DOIUrl":"https://doi.org/10.1142/S0219720021400138","url":null,"abstract":"<p><p>The exon shuffling theory posits that intronic recombination creates new domain combinations, facilitating the evolution of novel protein function. This theory predicts that introns will be preferentially situated near domain boundaries. Many studies have sought evidence for exon shuffling by testing the correspondence between introns and domain boundaries against chance intron positioning. Here, we present an empirical investigation of how the choice of null model influences significance. Although genome-wide studies have used a uniform null model, exclusively, more realistic null models have been proposed for single gene studies. We extended these models for genome-wide analyses and applied them to 21 metazoan and fungal genomes. Our results show that compared with the other two models, the uniform model does not recapitulate genuine exon lengths, dramatically underestimates the probability of chance agreement, and overestimates the significance of intron-domain correspondence by as much as 100 orders of magnitude. Model choice had much greater impact on the assessment of exon shuffling in fungal genomes than in metazoa, leading to different evolutionary conclusions in seven of the 16 fungal genomes tested. Genome-wide studies that use this overly permissive null model may exaggerate the importance of exon shuffling as a general mechanism of multidomain evolution.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"19 6","pages":"2140013"},"PeriodicalIF":1.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39645906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sequence Alignment","authors":"B. Tiwary","doi":"10.1007/978-981-16-4241-8_4","DOIUrl":"https://doi.org/10.1007/978-981-16-4241-8_4","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"79 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2021-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80341936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Systems Biology","authors":"B. Tiwary","doi":"10.1002/9783527809080.cataz16171","DOIUrl":"https://doi.org/10.1002/9783527809080.cataz16171","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"118 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2021-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88037792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introduction to Bioinformatics and Computational Biology","authors":"B. Tiwary","doi":"10.1007/978-981-16-4241-8_1","DOIUrl":"https://doi.org/10.1007/978-981-16-4241-8_1","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"30 3 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2021-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76947480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Next-Generation Sequencing","authors":"B. Tiwary","doi":"10.1007/978-981-16-4241-8_7","DOIUrl":"https://doi.org/10.1007/978-981-16-4241-8_7","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"5 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2021-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75620007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}