Pruning Forests to Find the Trees
H. Jamil
Proceedings of the 28th International Conference on Scientific and Statistical Database Management, 2016-07-18
DOI: 10.1145/2949689.2949697 (https://doi.org/10.1145/2949689.2949697)
Citations: 1
Abstract
The vast majority of phylogenetic databases do not offer a declarative query platform through which their contents can be accessed flexibly and conveniently. The template-based query interfaces they support do not allow arbitrary speculative queries. While a small number of graph query languages such as XQuery, Cypher and GraphQL exist for computer-savvy users, most are too general and complex to be useful for biologists, and too inefficient for querying large phylogenies. In this paper, we discuss a recently introduced visual query language, called PhyQL, that leverages phylogeny-specific properties to support essential and powerful constructs for a large class of phylogenetic queries. Its implementation, based on a deductive reasoner, offers opportunities for a wide range of pruning strategies that speed up processing through query-specific optimization, making it suitable for querying large phylogenies. A hybrid optimization technique that exploits a set of indices and "graphlet" partitioning is discussed. A "fail soonest" strategy is used to avoid hopeless processing and is shown to produce dividends.