William R. Pearson, Aaron J. Mackey
Using SQL Databases for Sequence Similarity Searching and Analysis

Abstract: Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.

Journal: Current Protocols in Bioinformatics, 59(1)
Publication date: 2018-02-13
Publication type: Journal Article
DOI: 10.1002/cpbi.32
URL: https://onlinelibrary.wiley.com/doi/10.1002/cpbi.32
JCR: Q1 (Biochemistry, Genetics and Molecular Biology)
Citations: 6