{"title":"Testing query execution engines with mutations","authors":"Xinyue Chen, Chenglong Wang, Alvin Cheung","doi":"10.1145/3395032.3395322","DOIUrl":"https://doi.org/10.1145/3395032.3395322","url":null,"abstract":"Query optimizer engine plays an important role in modern database systems. However, due to the complex nature of query optimizers, validating the correctness of a query execution engine is inherently challenging. In particular, the high cost of testing query execution engines often prevents developers from making fast iteration during the development process, which can increase the development cycle or lead to production-level bugs. To address this challenge, we propose a tool, MutaSQL, that can quickly discover correctness bugs in SQL execution engines. MutaSQL generates test cases by mutating a query Q over database D into a query Q′ that should evaluate to the same result as Q on D. MutaSQL then checks the execution results of Q′ and Q on the tested engine. We evaluated MutaSQL on previous SQLite versions with known bugs as well as the newest SQLite release. The result shows that MutaSQL can effectively reproduce 34 bugs in previous versions and discover a new bug in the current SQLite release.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122734612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On another level: how to debug compiling query engines","authors":"T. Kersten, Thomas Neumann","doi":"10.1145/3395032.3395321","DOIUrl":"https://doi.org/10.1145/3395032.3395321","url":null,"abstract":"Compilation-based query engines generate and compile code at runtime, which is then run to get the query result. In this process there are two levels of source code involved: The code of the code generator itself and the code that is generated at runtime. This can make debugging quite indirect, as a fault in the generated code was caused by an error in the generator. To find the error, we have to look at both, the generated code and the code that generated it. Current debugging technology is not equipped to handle this situation. For example, GNU's gdb only offers facilities to inspect one source line, but not multiple source levels. Also, current debuggers are not able to reconstruct additional program state for further source levels, thus, context is missing during debugging. In this paper, we show how to build a multi-level debugger for generated queries that solves these issues.We propose to use a timetravelling debugger to provide context information for compile-time and runtime, thus providing full interactive debugging capabilities for every source level.We also present how to build such a debugger with low engineering effort by combining existing tool chains.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134502807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Workload merging potential in SAP Hybris","authors":"Robin Rehrmann, Martin Keppner, Wolfgang Lehner, Carsten Binnig, Arne Schwarz","doi":"10.1145/3395032.3395326","DOIUrl":"https://doi.org/10.1145/3395032.3395326","url":null,"abstract":"OLTP DBMSs in enterprise scenarios are often facing the challenge to deal with workload peaks resulting from events such as Cyber Monday or Black Friday. The traditional solution to prevent running out of resources and thus coping with such workload peaks is to use a significant over-provisioning of the underlying infrastructure. Another direction to cope with such peak scenarios is to apply resource sharing. In a recent work, we showed that merging read statements in OLTP scenarios offers the opportunity to maintain low latency for systems under heavy load without over-provisioning. In this paper, we analyze a real enterprise OLTP workload --- SAP Hybris --- with respect to statements types, complexity, and hot-spot statements to find potential candidates for workload sharing in OLTP. We additionally share work of the Hybris workload in our system OLTPShare and report on savings with respect to CPU consumption. Another interesting effect we show is that with OLTPShare, we can increase the SAP Hybris throughput by 20%.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127189941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FacetE","authors":"Michael Günther, Paul Sikorski, M. Thiele, W. Lehner","doi":"10.1145/3395032.3395325","DOIUrl":"https://doi.org/10.1145/3395032.3395325","url":null,"abstract":"Today's natural language processing and information retrieval systems heavily depend on word embedding techniques to represent text values. However, given a specific task deciding for a word embedding dataset is not trivial. Current word embedding evaluation methods mostly provide only a one-dimensional quality measure, which does not express how knowledge from different domains is represented in the word embedding models. To overcome this limitation, we provide a new evaluation data set called FacetE derived from 125M Web tables, enabling domain-sensitive evaluation. We show that FacetE can effectively be used to evaluate word embedding models. The evaluation of common general-purpose word embedding models suggests that there is currently no best word embedding for every domain.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"284 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123431282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CoreBigBench","authors":"Todor Ivanov, A. Ghazal, A. Crolotte, Pekka Kostamaa, Yoseph Ghazal","doi":"10.1145/3395032.3395324","DOIUrl":"https://doi.org/10.1145/3395032.3395324","url":null,"abstract":"Significant effort was put into big data benchmarking with focus on end-to-end applications. While covering basic functionalities implicitly, the details of the individual contributions to the overall performance are hidden. As a result, end-to-end benchmarks could be biased toward certain basic functions. Micro-benchmarks are more explicit at covering basic functionalities but they are usually targeted at some highly specialized functions. In this paper we present CoreBigBench, a benchmark that focuses on the most common big data engines/platforms functionalities like scans, two way joins, common UDF execution and more. These common functionalities are benchmarked over relational and key-value data models which covers majority of data models. The benchmark consists of 22 queries applied to sales data and key-value web logs covering the basic functionalities. We ran CoreBigBench on Hive as a proof of concept and verified that the benchmark is easy to deploy and collected performance data. Finally, we believe that CoreBigBench is a good fit for commercial big data engines performance testing focused on basic engine functionalities not covered in end-to-end benchmarks.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122428413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SparkFuzz","authors":"Bogdan Ghit, Nicolás Poggi, J. Rosen, Reynold Xin, P. Boncz","doi":"10.1145/3395032.3395327","DOIUrl":"https://doi.org/10.1145/3395032.3395327","url":null,"abstract":"With more than 1200 contributors, Apache Spark is one of the most actively developed open source projects. At this scale and pace of development, mistakes are bound to happen. In this paper we present SparkFuzz, a toolkit we developed at Databricks for uncovering correctness errors in the Spark SQL engine. To guard the system against correctness errors, SparkFuzz takes a fuzzing approach to testing by generating random data and queries. Spark-Fuzz executes the generated queries on a reference database system such as PostgreSQL which is then used as a test oracle to verify the results returned by Spark SQL. We explain the approach we take to data and query generation and we analyze the coverage of SparkFuzz. We show that SparkFuzz achieves its current maximum coverage relatively fast by generating a small number of queries.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130041770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated system performance testing at MongoDB","authors":"Henrik Ingo, D. Daly","doi":"10.1145/3395032.3395323","DOIUrl":"https://doi.org/10.1145/3395032.3395323","url":null,"abstract":"Distributed Systems Infrastructure (DSI) is MongoDB's framework for running fully automated system performance tests in our Continuous Integration (CI) environment. To run in CI it needs to automate everything end-to-end: provisioning and deploying multinode clusters, executing tests, tuning the system for repeatable results, and collecting and analyzing the results. Today DSI is MongoDB's most used and most useful performance testing tool. It runs almost 200 different benchmarks in daily CI, and we also use it for manual performance investigations. As we can alert the responsible engineer in a timely fashion, all but one of the major regressions were fixed before the 4.2.0 release. We are also able to catch net new improvements, of which DSI caught 17. We open sourced DSI in March 2020.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134486223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finding the Pitfalls in Query Performance","authors":"M. Kersten, P. Koutsourakis, Y. Zhang","doi":"10.1145/3209950.3209951","DOIUrl":"https://doi.org/10.1145/3209950.3209951","url":null,"abstract":"Despite their popularity, database benchmarks only highlight a small part of the capabilities of any given system. They do not necessarily highlight problematic components encountered in real life or provide hints for further research and engineering. In this paper we introduce discriminative performance benchmarking, which aids in exploring a larger search space to find performance outliers and their underlying cause. The approach is based on deriving a domain specific language from a sample query to identify a query workload. SQLscalpel subsequently explores the space using query morphing, and simulated annealing to find performance outliers, and the query components responsible. To speed-up the exploration for often time-consuming experiments SQLscalpel has been designed to run asynchronously on a large cluster of machines.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115292507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Snowtrail","authors":"Jiaqi Yan, Q. Jin, Shrainik Jain, Stratis Viglas, Allison Lee","doi":"10.1145/3209950.3209958","DOIUrl":"https://doi.org/10.1145/3209950.3209958","url":null,"abstract":"Database as a service provided on cloud computing platforms has been rapidly gaining popularity in recent years. The Snowflake Elastic Data Warehouse (henceforth referred to as Snowflake) is a cloud database service provided by Snowflake Computing. The cloud native capabilities of new database services such as Snowflake bring exciting new opportunities for database testing. First, Snowflake maintains extensive knowledge of historical customer queries, including both the query text and corresponding system configurations. Second, Snowflake is multi-tenant, which provides easy access to metadata and data that can be used to rerun customer queries from a privileged role. Furthermore, the elastic nature of Snowflake's data warehouse service allows testing with these queries using a separate set of resources without impacting the customer's production workload. This paper presents Snowtrail, an infrastructure developed within Snowflake for testing using customer production queries with result obfuscation. Running tests with production queries provides us with direct insight into the impact of improvements and new features on customer workloads. It enables testing on queries of more shapes and complexity than can be manually constructed by developers. Snowtrail is also used to help ensure the stability of the online upgrade process of the system.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116419131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Get Real: How Benchmarks Fail to Represent the Real World","authors":"Adrian Vogelsgesang, Michael Haubenschild, Jan Finis, A. Kemper, Viktor Leis, Tobias Mühlbauer, Thomas Neumann, M. Then","doi":"10.1145/3209950.3209952","DOIUrl":"https://doi.org/10.1145/3209950.3209952","url":null,"abstract":"Industrial as well as academic analytics systems are usually evaluated based on well-known standard benchmarks, such as TPC-H or TPC-DS. These benchmarks test various components of the DBMS including the join optimizer, the implementation of the join and aggregation operators, concurrency control and the scheduler. However, these benchmarks fall short of evaluating the \"real\" challenges imposed by modern BI systems, such as Tableau, that emit machine-generated query workloads. This paper reports a comprehensive study based on a set of more than 60k real-world BI data repositories together with their generated query workload. The machine-generated workload posed by BI tools differs from the \"hand-crafted\" benchmark queries in multiple ways: Structurally simple relational operator trees often come with extremely complex scalar expressions such that expression evaluation becomes the limiting factor. At the same time, we also encountered much more complex relational operator trees than covered by benchmarks. This long tail in both, operator tree and expression complexity, is not adequately represented in standard benchmarks. We contribute various statistics gathered from the large dataset, e.g., data type distributions, operator frequency, string length distribution and expression complexity. We hope our study gives an impetus to database researchers and benchmark designers alike to address the relevant problems in future projects and to enable better database support for data exploration systems which become more and more important in the Big Data era.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134112692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}