Path Tree: Document Synopsis for XPath Query Selectivity Estimation

M. Alrammal, G. Hains, Mohamed Zergaoui
Published in: 2011 International Conference on Complex, Intelligent, and Software Intensive Systems
Publication date: 2011-06-30
DOI: 10.1109/CISIS.2011.53 (https://doi.org/10.1109/CISIS.2011.53)
Citations: 12

Abstract

XML is one of the most important standards for manipulating data on the Internet. However, querying large volumes of XML data is a bottleneck for several computationally intensive applications. One solution is to pre-process the document in streaming mode, using resources approximately proportional to document depth and query selectivity, so that limited processing space can accommodate much larger documents. The actual savings, however, vary so much that they are unpredictable. To overcome this limitation of stream processing, we propose a new application of the path tree synopsis data structure. Such a synopsis provides a succinct description of the original document with low computational overhead and high accuracy for processing tasks such as selectivity estimation and query answer approximation. In this paper, we formally define the path tree synopsis, informally introduced by [1] and used by [25], and propose a new streaming algorithm to construct it. We also present an online stream-querying system that can estimate the cost of a given query before answering it accurately. The core algorithm is adapted from the LQ algorithm of \cite{Gou:Eff}; we apply it to path tree traversal, cost estimation, query processing, and even optimizations.
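To make the idea of a path tree synopsis concrete, here is a minimal sketch, not the paper's algorithm. It assumes the usual reading of a path tree: one synopsis node per distinct root-to-element label path, annotated with the number of document elements that the path reaches. The names `PathTreeNode`, `build_path_tree`, and `estimate` are hypothetical, and the standard-library `xml.etree.ElementTree.iterparse` stands in for whatever stream parser the authors used. The working stack holds only the current path, so its size is bounded by document depth.

```python
import io
import xml.etree.ElementTree as ET


class PathTreeNode:
    """One synopsis node per distinct root-to-element label path (assumed definition)."""

    def __init__(self, tag):
        self.tag = tag
        self.count = 0       # number of document elements reached by this path
        self.children = {}   # child tag -> PathTreeNode


def build_path_tree(source):
    """Build the synopsis in a single streaming pass over the XML input."""
    virtual_root = PathTreeNode(None)
    stack = [virtual_root]   # synopsis nodes along the currently open path
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            parent = stack[-1]
            node = parent.children.get(elem.tag)
            if node is None:
                node = PathTreeNode(elem.tag)
                parent.children[elem.tag] = node
            node.count += 1
            stack.append(node)
        else:
            stack.pop()
            elem.clear()     # discard the closed element's text and children
    return virtual_root


def estimate(root, path):
    """Look up the count for a child-axis-only path such as '/lib/book'."""
    node = root
    for tag in path.strip("/").split("/"):
        node = node.children.get(tag)
        if node is None:
            return 0
    return node.count


doc = (b"<lib><book><title/><author/><author/></book>"
       b"<book><title/></book><paper><title/></paper></lib>")
tree = build_path_tree(io.BytesIO(doc))
print(estimate(tree, "/lib/book"))         # 2
print(estimate(tree, "/lib/book/author"))  # 2
print(estimate(tree, "/lib/book/year"))    # 0
```

For paths that use only the child axis and no predicates, this lookup returns the exact number of matching elements. Queries with descendant axes or predicates need the traversal and estimation techniques that the paper itself describes, which this sketch does not attempt.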