Identification of minimal eukaryotic introns through GeneBase, a user-friendly tool for parsing the NCBI Gene databank

DNA Research: An International Journal for Rapid Publication of Reports on Genes and Genomes, Pub Date: 2015-11-17, DOI: 10.1093/dnares/dsv028

A. Piovesan, M. Caracausi, Marco Ricci, P. Strippoli, L. Vitale, M. C. Pelleri

Pages: 495-503

Citations: 38

Abstract



We have developed GeneBase, a full parser of the National Center for Biotechnology Information (NCBI) Gene database, which generates a fully structured local database with an intuitive, user-friendly graphical interface for personal computers. Features of all annotated eukaryotic genes are accessible through three main software tables, which include, for each entry, details such as the gene summary, the gene exon/intron structure and the specific Gene Ontology attributions. The structuring of the data, the creation of additional calculation fields and the integration with nucleotide sequences allow users to make many types of comparisons and calculations that are useful for data retrieval and analysis. We provide an original example analysis of the existing introns across all available species, through which the classic biological problem of the ‘minimal intron’ may find a solution using available data. Based on all currently available data, we can define the shortest known eukaryotic GT-AG intron length, setting the physical limit at the 30-base-pair intron belonging to the human MST1L gene. This ‘model intron’ will shed light on the minimal elements of recognition required for conventional splicing. Remarkably, this size is indeed consistent with the sum of the splicing consensus sequence lengths.
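The abstract does not show how GeneBase computes intron features, so the following is only a minimal sketch of the kind of calculation it describes: deriving introns from an annotated exon structure, measuring their lengths, and flagging GT-AG boundaries. The names `Intron`, `introns_from_exons` and `shortest_gt_ag` are hypothetical, coordinates are assumed to be 1-based and inclusive on the plus strand, and the toy sequence is invented for illustration (it is not the MST1L intron).

```python
from typing import List, NamedTuple, Optional, Tuple


class Intron(NamedTuple):
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    length: int      # number of bases in the intron
    is_gt_ag: bool   # True if the intron starts with GT and ends with AG


def introns_from_exons(seq: str, exons: List[Tuple[int, int]]) -> List[Intron]:
    """Derive introns as the gaps between consecutive exons (plus strand assumed)."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        start, end = prev_end + 1, next_start - 1
        if end < start:
            continue  # adjacent or overlapping exons leave no intron
        intron_seq = seq[start - 1:end].upper()
        is_gt_ag = (
            len(intron_seq) >= 4
            and intron_seq.startswith("GT")
            and intron_seq.endswith("AG")
        )
        introns.append(Intron(start, end, end - start + 1, is_gt_ag))
    return introns


def shortest_gt_ag(introns: List[Intron]) -> Optional[Intron]:
    """Return the shortest intron with GT-AG boundaries, or None if there is none."""
    candidates = [i for i in introns if i.is_gt_ag]
    return min(candidates, key=lambda i: i.length, default=None)


# Toy gene (invented): three exons separated by a 30-base GT-AG intron
# and a 44-base intron that starts with GC instead of GT.
toy_seq = (
    "ATGGCC"                    # exon 1, positions 1-6
    + "GT" + "A" * 26 + "AG"    # intron 1, positions 7-36
    + "CCCGGG"                  # exon 2, positions 37-42
    + "GC" + "T" * 40 + "AG"    # intron 2, positions 43-86
    + "TTTAAA"                  # exon 3, positions 87-92
)
toy_exons = [(1, 6), (37, 42), (87, 92)]

toy_introns = introns_from_exons(toy_seq, toy_exons)
print(toy_introns)
print(shortest_gt_ag(toy_introns))
```

Applied to every annotated gene in a local copy of the database, a minimum over such records is one way to arrive at a "shortest known GT-AG intron" figure; a real analysis would also need to handle minus-strand genes and multiple transcripts per gene, which this sketch ignores.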
