XML query processing: efficiency and optimality
Radim Bača, M. Krátký
Proceedings. International Database Engineering and Applications Symposium, vol. 4 1, pp. 8-13
Published: 2012-08-08
DOI: 10.1145/2351476.2351478 (https://doi.org/10.1145/2351476.2351478)
Citations: 4
Abstract
XML (Extensible Markup Language) is a well-established format that is often used to model semi-structured data. XPath and XQuery are the de facto standard XML query languages, and searching for occurrences of a twig pattern query (TPQ) in an XML document is one of their core tasks.
Many different approaches address the TPQ matching problem. This article compares the state-of-the-art techniques and gives an overview that clarifies the relationships between the methodologies used in this area. We distinguish three main areas of TPQ processing: (1) index data structures and XML document partitioning, (2) join algorithms, and (3) cost-based optimizations. We cover the most important techniques in each area and explain their relationships and possible combinations.
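To make the notion of a twig pattern query concrete, the following is a minimal sketch of naive TPQ matching, assuming a query is a small tree whose edges are either parent-child or ancestor-descendant. It is a brute-force traversal for illustration only and is not one of the index-based or join-based techniques the article surveys. The names `TwigNode`, `matches`, and `find_twig`, as well as the sample document, are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class TwigNode:
    """One node of a twig pattern (hypothetical helper type)."""
    tag: str
    # Each entry is (axis, node), where axis is "child" or "descendant".
    children: list = field(default_factory=list)


def matches(elem, q):
    """Return True if query node q can be embedded at element elem."""
    if elem.tag != q.tag:
        return False
    for axis, child_q in q.children:
        if axis == "child":
            candidates = list(elem)
        else:
            # elem.iter() yields elem itself first; skip it to keep
            # only proper descendants.
            candidates = list(elem.iter())[1:]
        if not any(matches(c, child_q) for c in candidates):
            return False
    return True


def find_twig(root, q):
    """Return every element at which the query root can be matched."""
    return [e for e in root.iter() if matches(e, q)]


DOC = """
<lib>
  <book><title>A</title><author><name>X</name></author></book>
  <book><title>B</title></book>
  <book><info><title>C</title></info><editor><name>Y</name></editor></book>
</lib>
"""

# Twig for the XPath-like pattern book[title]//name:
# a book with a child title and a descendant name.
query = TwigNode("book", [
    ("child", TwigNode("title")),
    ("descendant", TwigNode("name")),
])

root = ET.fromstring(DOC)
result = find_twig(root, query)
titles = [b.find("title").text for b in result]
```

Only the first `book` satisfies both branches: the second has no `name` descendant, and in the third `title` is not a direct child. The sketch reports only the elements bound to the query root rather than full match tuples, and it revisits subtrees repeatedly, which is the kind of cost that indexing, join algorithms, and cost-based optimization aim to reduce.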