On the optimal ordering of maps and selections under factorization

21st International Conference on Data Engineering (ICDE'05) · Pub Date: 2005-04-05 · DOI: 10.1109/ICDE.2005.97

Thomas Neumann, S. Helmer, G. Moerkotte


Citations: 15

Abstract

The query optimizer of a database system is confronted with two aspects when handling user-defined functions (UDFs) in query predicates: the vast differences in evaluation costs between UDFs (and other functions), and multiple calls of the same (expensive) UDF. The former is dealt with by ordering the evaluation of the predicates optimally, the latter by identifying common subexpressions and thereby avoiding costly recomputation. Current approaches order n predicates optimally (neglecting factorization) in O(n log n). Their result may deviate significantly from the optimal solution under factorization. We formalize the problem of finding optimal orderings under factorization and prove that it is NP-hard. Furthermore, we show how to improve on the run time of the brute-force algorithm (which computes all possible orderings) by presenting different enhanced algorithms. Although in the worst case these algorithms obviously still behave exponentially, our experiments demonstrate that for real-life examples their performance is much better.
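The gap between sort-based ordering and the optimum under factorization can be illustrated with a small sketch. It is not the paper's algorithm. It assumes the classic rank metric (selectivity - 1) / cost for the O(n log n) ordering, independent selectivities, and a simple cost model in which a UDF result computed for one predicate is reused for free by later predicates. The predicate names, UDF names, costs, and selectivities are all hypothetical.

```python
from itertools import permutations

# Hypothetical per-tuple UDF costs and predicates (illustrative values only).
UDF_COST = {"f": 100.0, "h": 80.0}
# predicate name -> (selectivity, UDFs the predicate calls)
PREDICATES = {
    "p1": (0.5, frozenset({"f"})),
    "p2": (0.5, frozenset({"f"})),
    "q": (0.5, frozenset({"h"})),
}


def expected_cost(order, factorize):
    """Expected per-tuple cost of evaluating the predicates in `order`.

    Assumes independent selectivities. With factorize=True, a UDF result
    computed for an earlier predicate is reused at no extra cost.
    """
    total, surviving, cached = 0.0, 1.0, set()
    for name in order:
        selectivity, udfs = PREDICATES[name]
        to_compute = udfs - cached if factorize else udfs
        total += surviving * sum(UDF_COST[u] for u in to_compute)
        cached |= udfs
        surviving *= selectivity
    return total


def rank_order():
    """Sort by rank = (selectivity - 1) / standalone cost, in O(n log n)."""
    def rank(name):
        selectivity, udfs = PREDICATES[name]
        return (selectivity - 1.0) / sum(UDF_COST[u] for u in udfs)
    return sorted(PREDICATES, key=rank)


def brute_force(factorize):
    """Try all n! orderings and keep the cheapest one."""
    return min(permutations(PREDICATES),
               key=lambda order: expected_cost(order, factorize))


greedy = rank_order()
best = brute_force(factorize=True)
print(greedy, expected_cost(greedy, factorize=True))
print(list(best), expected_cost(best, factorize=True))
```

In this toy instance the rank-based order evaluates q first and costs 130 per tuple, while the brute-force search finds an order that evaluates the two predicates sharing f back to back and costs 120. Without sharing, the rank-based order matches the brute-force optimum, which is the situation the sort-based approaches are designed for.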

