A flexible infrastructure for gathering XML statistics and estimating query cardinality
J. Freire, Maya Ramanath, Lingzhi Zhang
Proceedings. 20th International Conference on Data Engineering, 2004-03-30
DOI: https://doi.org/10.1109/ICDE.2004.1320085
A key component of XML data management systems is the result size estimator, which estimates the cardinalities of user queries. Estimated cardinalities are needed in a variety of tasks, including query optimization and cost-based storage design, and they can also give users early feedback about the expected outcome of their queries. In contrast to previously proposed result estimators, which use specialized data structures and estimation algorithms, StatiX uses histograms to uniformly capture both the structural and value skew present in documents. The original version of StatiX was built as a proof of concept. With the goal of making the system publicly available, we have built StatiX++, a new and improved version of StatiX, which extends the original system in significant ways. In this demonstration, we show the key features of StatiX++.
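To make the idea of histogram-based result size estimation concrete, here is a minimal sketch of a generic equi-width histogram used to estimate how many values satisfy a range predicate. It is not StatiX's or StatiX++'s actual data structure or algorithm, which the abstract does not describe. The function names (`build_histogram`, `estimate_less_than`), the sample values, and the uniformity assumption inside each bucket are all hypothetical choices made for illustration.

```python
def build_histogram(values, lo, hi, buckets):
    """Build an equi-width histogram over [lo, hi].

    counts[i] holds the number of values in [lo + i*width, lo + (i+1)*width);
    a value equal to hi is clamped into the last bucket.
    """
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        i = min(int((v - lo) / width), buckets - 1)
        counts[i] += 1
    return counts, width


def estimate_less_than(counts, lo, width, x):
    """Estimate how many values are < x.

    Assumes values are spread uniformly inside each bucket (an illustrative
    assumption, not something stated in the abstract).
    """
    total = 0.0
    for i, c in enumerate(counts):
        b_lo = lo + i * width
        b_hi = b_lo + width
        if x >= b_hi:
            total += c                          # bucket lies entirely below x
        elif x > b_lo:
            total += c * (x - b_lo) / width     # bucket is partially covered
    return total


# Hypothetical numeric values pulled from some element in an XML document.
prices = [5, 12, 18, 25, 40, 41, 47, 90]
counts, width = build_histogram(prices, lo=0, hi=100, buckets=4)
estimate = estimate_less_than(counts, lo=0, width=width, x=30)
```

With these sample values the buckets hold 3, 4, 0 and 1 values, and the estimate for `price < 30` comes out at roughly 3.8 against a true count of 4. The gap comes from the uniformity assumption, which is why the number and placement of buckets matters when the data is skewed.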