Meta-data indexing for XPath location steps

Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data. Pub Date: 2006-06-27. DOI: 10.1145/1142473.1142525

SungRan Cho, Nick Koudas, D. Srivastava


Citations: 14

Abstract

XML is the de facto standard for data representation and exchange over the Web. Given the diversity of the information available in XML, it is very useful to annotate XML data with a wide variety of meta-data, such as quality and sensitivity. When querying such XML data, say using XPath, it is important to efficiently identify the data that meet specified constraints on the meta-data. For example, different users may be satisfied with different levels of quality guarantees, or may have access only to certain parts of the XML data based on specified security policies. In this paper, we address the problem of efficiently identifying the XML elements along a location step in an XPath query that satisfy meta-data range constraints, when the meta-data levels are drawn from an ordered domain (e.g., accuracy in [0,1], recency using timestamps, multi-level security). More specifically, we develop a family of index structures, which we refer to as meta-data indexes, to address this problem. A meta-data index is easily instantiated using a multi-dimensional index structure, such as an R-tree, incorporating novel query and update algorithms. We show that the full meta-data index (FMI), which associates each XML element with its meta-data level, has a very high update cost for modifying an element's meta-data level. We resolve this problem by designing the inheritance meta-data index (IMI), in which (i) actual meta-data levels are associated only with elements for which this value is explicitly specified, and (ii) inherited meta-data levels and inheritance source nodes are associated with non-leaf nodes of the index structure. We design efficient query (for all XPath axes) and update (of meta-data levels) algorithms for the IMI, and experimentally demonstrate the superiority of the IMI over the FMI using benchmark data sets.
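
The contrast between storing a level for every element and storing only explicitly specified levels can be illustrated with a small Python sketch. This is not the paper's R-tree-based index or its algorithms. It is a toy in-memory tree, and it assumes (the abstract does not spell this out) that an element without an explicit level inherits from its nearest ancestor that has one. All names (Element, effective_level, child_step, entries_touched_full) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """Toy XML element; `level` is set only when explicitly specified."""
    name: str
    level: Optional[float] = None
    parent: Optional["Element"] = field(default=None, repr=False)
    children: List["Element"] = field(default_factory=list)

    def add(self, child: "Element") -> "Element":
        """Attach `child` under this element and return it."""
        child.parent = self
        self.children.append(child)
        return child


def effective_level(e: Element) -> Optional[float]:
    """Assumed rule: take the nearest ancestor-or-self explicit level."""
    node: Optional[Element] = e
    while node is not None:
        if node.level is not None:
            return node.level
        node = node.parent
    return None


def child_step(context: Element, lo: float, hi: float) -> List[str]:
    """child:: location step, keeping children whose level lies in [lo, hi]."""
    result = []
    for c in context.children:
        lvl = effective_level(c)
        if lvl is not None and lo <= lvl <= hi:
            result.append(c.name)
    return result


def entries_touched_full(e: Element) -> int:
    """Stored levels to rewrite if every element keeps its own level:
    `e` itself plus every descendant that inherits from `e`."""
    count = 1
    for c in e.children:
        if c.level is None:  # descendants with explicit levels are unaffected
            count += entries_touched_full(c)
    return count


# Example document: accuracy levels in [0, 1], explicit on three elements.
root = Element("root", level=0.9)
a = root.add(Element("a"))             # inherits 0.9 from root
a1 = a.add(Element("a1"))              # inherits 0.9 from root
a2 = a.add(Element("a2", level=0.3))   # explicit
b = root.add(Element("b", level=0.5))  # explicit
b1 = b.add(Element("b1"))              # inherits 0.5 from b

print(child_step(root, 0.8, 1.0))
print(entries_touched_full(root))
```

In this toy model, changing the level of root forces a fully materialized scheme to rewrite one stored level per inheriting element, while a scheme that keeps only explicit levels changes a single entry. That is the intuition behind the update-cost gap between the FMI and the IMI described in the abstract; the actual index structures and costs are those given in the paper.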
