查询XML文档变得容易:最近的概念查询

Proceedings 17th International Conference on Data Engineering Pub Date : 2001-04-02 DOI:10.1109/ICDE.2001.914844

A. Schmidt, M. Kersten, Menzo Windhouwer

{"title":"查询XML文档变得容易:最近的概念查询","authors":"A. Schmidt, M. Kersten, Menzo Windhouwer","doi":"10.1109/ICDE.2001.914844","DOIUrl":null,"url":null,"abstract":"Due to the ubiquity and popularity of XML, users often are in the following situation: they want to query XML documents which contain potentially interesting information but they are unaware of the mark-up structure that is used. For example, it is easy to guess the contents of an XML bibliography file whereas the mark-up depends on the methodological, cultural and personal background of the author(s). None the less, it is this hierarchical structure that forms the basis of XML query languages. We exploit the tree structure of XML documents to equip users with a powerful tool, the meet operator that lets them query databases with whose content they are familiar, but without requiring knowledge of tags and hierarchies. Our approach is based on computing the lowest common ancestor of nodes in the XML syntax tree: e.g., given two strings, we are looking for nodes whose offspring contains these two strings. The novelty of this approach is that the result type is unknown at query formulation time and dependent on the database instance. If the two strings are an author's name and a year mainly publications of the author in this year are returned. If the two strings are numbers the result mostly consists of publications that have the numbers as year or page numbers. Because the result type of a query is not specified by the user we refer to the lowest common ancestor as nearest concept. We also present a running example taken from the bibliography domain, and demonstrate that the operator can be implemented efficiently.","PeriodicalId":431818,"journal":{"name":"Proceedings 17th International Conference on Data Engineering","volume":"47 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"126","resultStr":"{\"title\":\"Querying XML documents made easy: nearest concept queries\",\"authors\":\"A. Schmidt, M. Kersten, Menzo Windhouwer\",\"doi\":\"10.1109/ICDE.2001.914844\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Due to the ubiquity and popularity of XML, users often are in the following situation: they want to query XML documents which contain potentially interesting information but they are unaware of the mark-up structure that is used. For example, it is easy to guess the contents of an XML bibliography file whereas the mark-up depends on the methodological, cultural and personal background of the author(s). None the less, it is this hierarchical structure that forms the basis of XML query languages. We exploit the tree structure of XML documents to equip users with a powerful tool, the meet operator that lets them query databases with whose content they are familiar, but without requiring knowledge of tags and hierarchies. Our approach is based on computing the lowest common ancestor of nodes in the XML syntax tree: e.g., given two strings, we are looking for nodes whose offspring contains these two strings. The novelty of this approach is that the result type is unknown at query formulation time and dependent on the database instance. If the two strings are an author's name and a year mainly publications of the author in this year are returned. If the two strings are numbers the result mostly consists of publications that have the numbers as year or page numbers. Because the result type of a query is not specified by the user we refer to the lowest common ancestor as nearest concept. We also present a running example taken from the bibliography domain, and demonstrate that the operator can be implemented efficiently.\",\"PeriodicalId\":431818,\"journal\":{\"name\":\"Proceedings 17th International Conference on Data Engineering\",\"volume\":\"47 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2001-04-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"126\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings 17th International Conference on Data Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDE.2001.914844\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 17th International Conference on Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE.2001.914844","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 126

摘要

由于XML的普遍性和流行性，用户经常遇到以下情况:他们希望查询包含潜在有趣信息的XML文档，但他们不知道所使用的标记结构。例如，很容易猜测XML书目文件的内容，而标记则取决于作者的方法、文化和个人背景。然而，正是这种层次结构构成了XML查询语言的基础。我们利用XML文档的树状结构为用户提供了一个强大的工具，即meet操作符，它允许用户查询他们熟悉的内容的数据库，而不需要了解标记和层次结构。我们的方法是基于计算XML语法树中节点的最低共同祖先:例如，给定两个字符串，我们正在寻找其后代包含这两个字符串的节点。这种方法的新颖之处在于，结果类型在查询制定时是未知的，并且依赖于数据库实例。如果这两个字符串是作者的名字和年份，则主要返回作者在这一年的出版物。如果这两个字符串是数字，则结果主要由以数字作为年份或页码的出版物组成。因为查询的结果类型不是由用户指定的，所以我们将最低的共同祖先称为最近的概念。最后给出了一个书目领域的实例，验证了该算子的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Querying XML documents made easy: nearest concept queries

Due to the ubiquity and popularity of XML, users often are in the following situation: they want to query XML documents which contain potentially interesting information but they are unaware of the mark-up structure that is used. For example, it is easy to guess the contents of an XML bibliography file whereas the mark-up depends on the methodological, cultural and personal background of the author(s). None the less, it is this hierarchical structure that forms the basis of XML query languages. We exploit the tree structure of XML documents to equip users with a powerful tool, the meet operator that lets them query databases with whose content they are familiar, but without requiring knowledge of tags and hierarchies. Our approach is based on computing the lowest common ancestor of nodes in the XML syntax tree: e.g., given two strings, we are looking for nodes whose offspring contains these two strings. The novelty of this approach is that the result type is unknown at query formulation time and dependent on the database instance. If the two strings are an author's name and a year mainly publications of the author in this year are returned. If the two strings are numbers the result mostly consists of publications that have the numbers as year or page numbers. Because the result type of a query is not specified by the user we refer to the lowest common ancestor as nearest concept. We also present a running example taken from the bibliography domain, and demonstrate that the operator can be implemented efficiently.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings 17th International Conference on Data Engineering

自引率

0.00%

发文量