{"title":"路径树:XPath查询选择性估计的文档概要","authors":"M. Alrammal, G. Hains, Mohamed Zergaoui","doi":"10.1109/CISIS.2011.53","DOIUrl":null,"url":null,"abstract":"XML is one of the most important standards for manipulating data on the Internet. However, querying large volumes of XML data represents a bottleneck for several computationally intensive applications. A solution is to pre-process the document in streaming mode with resources approximately proportional to document depth and query selectivity. Limited processing space can then accommodate much larger documents. But the actual savings vary so much as to make them unpredictable. To overcome this limitation of stream-processing we propose a new application of the path tree synopsis data structure. Such a synopsis provides a succinct description of the original document with low computational overhead and high accuracy for processing tasks like selectivity estimation and query answer approximation. In this paper, we formally define the path tree synopsis, informally introduced by [1] and used by [25], and propose a new streaming algorithm to construct it. We also present an online stream-querying system able to estimate the cost for a given query before answering it accurately. The core algorithm is adapted from \\cite{Gou:Eff} LQ, we apply it to path tree traversal, cost estimation, query processing and even optimizations.","PeriodicalId":203206,"journal":{"name":"2011 International Conference on Complex, Intelligent, and Software Intensive Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"Path Tree: Document Synopsis for XPath Query Selectivity Estimation\",\"authors\":\"M. Alrammal, G. Hains, Mohamed Zergaoui\",\"doi\":\"10.1109/CISIS.2011.53\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"XML is one of the most important standards for manipulating data on the Internet. However, querying large volumes of XML data represents a bottleneck for several computationally intensive applications. A solution is to pre-process the document in streaming mode with resources approximately proportional to document depth and query selectivity. Limited processing space can then accommodate much larger documents. But the actual savings vary so much as to make them unpredictable. To overcome this limitation of stream-processing we propose a new application of the path tree synopsis data structure. Such a synopsis provides a succinct description of the original document with low computational overhead and high accuracy for processing tasks like selectivity estimation and query answer approximation. In this paper, we formally define the path tree synopsis, informally introduced by [1] and used by [25], and propose a new streaming algorithm to construct it. We also present an online stream-querying system able to estimate the cost for a given query before answering it accurately. The core algorithm is adapted from \\\\cite{Gou:Eff} LQ, we apply it to path tree traversal, cost estimation, query processing and even optimizations.\",\"PeriodicalId\":203206,\"journal\":{\"name\":\"2011 International Conference on Complex, Intelligent, and Software Intensive Systems\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-06-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 International Conference on Complex, Intelligent, and Software Intensive Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CISIS.2011.53\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 International Conference on Complex, Intelligent, and Software Intensive Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CISIS.2011.53","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Path Tree: Document Synopsis for XPath Query Selectivity Estimation
XML is one of the most important standards for manipulating data on the Internet. However, querying large volumes of XML data represents a bottleneck for several computationally intensive applications. A solution is to pre-process the document in streaming mode with resources approximately proportional to document depth and query selectivity. Limited processing space can then accommodate much larger documents. But the actual savings vary so much as to make them unpredictable. To overcome this limitation of stream-processing we propose a new application of the path tree synopsis data structure. Such a synopsis provides a succinct description of the original document with low computational overhead and high accuracy for processing tasks like selectivity estimation and query answer approximation. In this paper, we formally define the path tree synopsis, informally introduced by [1] and used by [25], and propose a new streaming algorithm to construct it. We also present an online stream-querying system able to estimate the cost for a given query before answering it accurately. The core algorithm is adapted from \cite{Gou:Eff} LQ, we apply it to path tree traversal, cost estimation, query processing and even optimizations.