A. Piovesan, M. Caracausi, Marco Ricci, P. Strippoli, L. Vitale, M. C. Pelleri
DNA Research: An International Journal for Rapid Publication of Reports on Genes and Genomes, pages 495 - 503. Published 2015-11-17. https://doi.org/10.1093/dnares/dsv028
Identification of minimal eukaryotic introns through GeneBase, a user-friendly tool for parsing the NCBI Gene databank
We have developed GeneBase, a full parser of the National Center for Biotechnology Information (NCBI) Gene database that generates a fully structured local database with an intuitive, user-friendly graphical interface for personal computers. Features of all annotated eukaryotic genes are accessible through three main software tables, which include, for each entry, details such as the gene summary, the exon/intron structure and the specific Gene Ontology attributions. The structuring of the data, the creation of additional calculation fields and the integration with nucleotide sequences allow users to perform many types of comparisons and calculations useful for data retrieval and analysis. We provide an original example analysis of the existing introns across all available species, through which the classic biological problem of the ‘minimal intron’ may find a solution using available data. Based on all currently available data, we define the shortest known eukaryotic GT-AG intron length, setting the physical limit at the 30 base pair intron belonging to the human MST1L gene. This ‘model intron’ should shed light on the minimal recognition elements required for conventional splicing. Remarkably, this size is consistent with the sum of the lengths of the splicing consensus sequences.
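The kind of calculation described in the abstract, deriving intron lengths from an exon/intron structure and selecting the shortest GT-AG intron, can be illustrated with a minimal Python sketch. This is not GeneBase's implementation: the function names, the 1-based inclusive forward-strand coordinate convention, and the toy sequence and exon table are all assumptions made purely for illustration.

```python
from typing import List, Optional, Tuple

# Assumption: intervals are 1-based, inclusive (start, end) on the forward strand.
Interval = Tuple[int, int]


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the intervals lying between consecutive exons of one transcript."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end > 1:  # skip adjacent or overlapping exons
            introns.append((prev_end + 1, next_start - 1))
    return introns


def is_gt_ag(sequence: str, intron: Interval) -> bool:
    """True if the intron sequence starts with GT and ends with AG."""
    start, end = intron
    intron_seq = sequence[start - 1:end].upper()
    return (
        len(intron_seq) >= 4
        and intron_seq.startswith("GT")
        and intron_seq.endswith("AG")
    )


def shortest_gt_ag_intron(sequence: str, exons: List[Interval]) -> Optional[Interval]:
    """Return the shortest GT-AG intron, or None if there is none."""
    candidates = [i for i in introns_from_exons(exons) if is_gt_ag(sequence, i)]
    if not candidates:
        return None
    return min(candidates, key=lambda i: i[1] - i[0] + 1)


# Hypothetical toy data (not a real gene): exons and introns concatenated in order.
TOY_SEQUENCE = (
    "ATGGCC"          # exon 1, positions 1-6
    "GTAAGTTTTCAG"    # GT-AG intron, positions 7-18
    "GGATCC"          # exon 2, positions 19-24
    "GTAAAG"          # GT-AG intron, positions 25-30
    "TTTAAA"          # exon 3, positions 31-36
    "ATAC"            # non-GT-AG interval, positions 37-40
    "GGGTGA"          # exon 4, positions 41-46
)
TOY_EXONS = [(1, 6), (19, 24), (31, 36), (41, 46)]

if __name__ == "__main__":
    best = shortest_gt_ag_intron(TOY_SEQUENCE, TOY_EXONS)
    if best is not None:
        print(best, best[1] - best[0] + 1)
```

In the toy data the 4-nucleotide interval is the shortest gap between exons, but it is excluded because it lacks GT-AG ends, so the 6-nucleotide intron at positions 25-30 is selected. Applying the same filter across all annotated genes of all species would correspond in spirit to the minimal-intron search described above, although real annotations would also require handling of strand and multiple transcripts per gene.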