Marina Herrera Sarrias, Christopher W Wheat, Liam M Longo, Lars Arvestad
Bioinformatics Advances, volume 5, issue 1, article vbaf177. Published 2025-07-28. DOI: https://doi.org/10.1093/bioadv/vbaf177. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12343006/pdf/
Exonize: a tool for finding and classifying exon duplications in annotated genomes.
Summary: The protein-coding regions of eukaryotic genes are fragmented into exons that, like the genes within which they are situated, can be duplicated, deleted, or reorganized. Cataloging and organizing within-gene exon similarities is necessary for a systematic study of exon evolution and its consequences. To facilitate the study of exon duplications, we present Exonize, a computational tool that identifies and classifies coding exon duplications in annotated genomes. Exonize implements a graph-based framework to handle clusters of related exons resulting from repeated rounds of exon duplication. The interdependence between duplicated exons or groups of exons across transcripts is classified. By identifying duplication events between exonic and intronic regions, Exonize can detect unannotated or degenerate exons. To aid in data parsing and downstream analysis, the Python module exonize_analysis is provided. The application of Exonize to 20 eukaryote genomes identifies full-exon duplications in at least 4% of vertebrate genes, with more than 900 human genes having a full-exon duplication event.
Availability and implementation: Exonize is available at https://github.com/msarrias/exonize.
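As an illustration of the graph-based idea in the Summary, the sketch below groups exons into clusters by taking connected components of a graph whose edges are pairwise similarity matches. It is a minimal sketch under stated assumptions, not Exonize's implementation or API: the function name `cluster_exons`, the `(start, end)` coordinate tuples, and the example matches are all hypothetical.

```python
from collections import defaultdict


def cluster_exons(pairs):
    """Group exons into clusters (connected components of the match graph).

    `pairs` is an iterable of (exon_a, exon_b) tuples, where each exon is a
    hypothetical (start, end) coordinate tuple within one gene.
    """
    # Build an undirected adjacency map from the pairwise matches.
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    clusters = []
    # Visit exons in coordinate order so the output is deterministic.
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects every exon reachable from `start`.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


# Hypothetical matches: the first three exons are linked by two matches,
# as could happen after repeated rounds of duplication.
matches = [
    ((100, 200), (300, 400)),
    ((300, 400), (500, 600)),
    ((900, 950), (1200, 1250)),
]
clusters = cluster_exons(matches)
print(clusters)
```

In this toy input, the exons at 100-200 and 500-600 are never matched directly, yet they land in the same cluster because both match the exon at 300-400. That transitive grouping is the reason a graph representation is a natural fit for exons produced by repeated duplication.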