Nils Grimsmo, T. A. Bjørklund, Øystein Torbjørnsen
2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications. Published 2010-04-11. DOI: 10.1109/DBKDA.2010.8 (https://doi.org/10.1109/DBKDA.2010.8)
XLeaf: Twig Evaluation with Skipping Loop Joins and Virtual Nodes
XML indexing and search has become an important topic, and twig joins are key building blocks in XML search systems. This paper describes a novel approach using a nested-loop twig join algorithm that combines several existing techniques to speed up evaluation of XML queries. We combine structural summaries, path indexing, and prefix path partitioning to reduce the amount of data read by the join. This effect is amplified by reading data only for leaf query nodes and inferring data for internal nodes from the structural summary. Skipping is used to speed up merges where query leaves have differing selectivity. Multiple access methods are implemented as materialized views instead of succinct secondary indexes for better locality. This redundancy is made affordable in terms of space by using compression in a back-end with columnar storage. We have implemented an experimental prototype, which shows a speedup of two orders of magnitude on XPath queries with value predicates when compared to existing open-source and commercial systems that use a subset of the techniques. Space usage is also improved.
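
The idea of skipping in a merge between inputs of differing selectivity can be illustrated with a minimal sketch. This is not the XLeaf algorithm itself: it assumes each query leaf yields a sorted list of integer identifiers, and the function name `skipping_intersect` and the sample data are hypothetical. For every entry in the more selective list, the code jumps ahead in the less selective list with a binary search instead of scanning it entry by entry.

```python
from bisect import bisect_left


def skipping_intersect(sparse, dense):
    """Intersect two sorted lists of integer ids.

    `sparse` stands in for the more selective input and `dense` for the
    less selective one (both names are illustrative assumptions).
    """
    result = []
    pos = 0
    for key in sparse:
        # Skip directly to the first entry in `dense` that is >= key,
        # starting from where the previous search ended.
        pos = bisect_left(dense, key, pos)
        if pos == len(dense):
            break  # `dense` is exhausted, so no further matches exist
        if dense[pos] == key:
            result.append(key)
    return result


selective = [7, 42, 90]            # few matches, e.g. a value predicate
frequent = list(range(0, 100, 3))  # many matches, e.g. a common element
print(skipping_intersect(selective, frequent))  # prints [42, 90]
```

The work done is roughly proportional to the size of the selective input times the logarithm of the larger one, which is why skipping pays off most when the selectivities differ widely.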