Proceedings of the Third International Workshop on Exploiting Artificial Intelligence Techniques for Data Management: latest publications

Research challenges in deep reinforcement learning-based join query optimization
R. Guo, Khuzaima S. Daudjee
Abstract: The order in which relations are joined and the physical join operators used are two aspects of query plans which have a significant impact on the execution latency of join queries. However, the set of valid query plans grows exponentially with the number of relations to be joined. Hence, it becomes computationally expensive to enumerate all such plans for a complex join query. Recently, several deep reinforcement learning (DRL) based approaches propose using neural networks to construct a query plan. They demonstrate that efficient query plans can be found without exhaustively enumerating the search space. We integrated our implementation of a DRL-based solution to optimize join order and operators into the PostgreSQL query optimizer. In practice, we found limitations in the quality of the query plans chosen which are not addressed in existing approaches. In this paper we highlight some of these limitations and propose future research challenges along with potential solutions.
DOI: https://doi.org/10.1145/3401071.3401657
Published: 2020-06-14
Citations: 13
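To give a sense of how quickly the plan space described in the abstract above grows, the following minimal sketch counts join orders for n relations. It rests on assumptions that are not taken from the paper: every pair of relations may be joined (cross products allowed), operand order matters, and physical operator choices are ignored. The function names are hypothetical. Under those assumptions there are n! left-deep orders and (2(n-1))!/(n-1)! bushy trees.

```python
from math import factorial


def left_deep_orders(n: int) -> int:
    """Number of left-deep join orders for n relations: one per permutation."""
    return factorial(n)


def bushy_trees(n: int) -> int:
    """Number of ordered binary join trees over n labeled relations.

    Equals n! times the Catalan number C(n-1), which simplifies to
    (2(n-1))! / (n-1)!.
    """
    return factorial(2 * (n - 1)) // factorial(n - 1)


if __name__ == "__main__":
    # Print how both counts grow as relations are added.
    for n in (2, 3, 5, 10, 15):
        print(n, left_deep_orders(n), bushy_trees(n))
```

Choosing a physical operator for each join multiplies these counts further, which is why exhaustive enumeration becomes impractical for complex queries.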
Bandit join: preliminary results
Vahid Ghadakchi, Mian Xie, Arash Termehchy
Abstract: Join is arguably the most costly and frequently used operation in relational query processing. Join algorithms usually spend the majority of their time on scanning and attempting to join the parts of the base relations that do not satisfy the join condition and do not generate any results. This causes slow response time, particularly, in interactive and exploratory environments where users would like real-time performance. In this paper, we outline our vision on using online learning and adaptation to execute joins efficiently. In our approach, scan operators that precede a join, learn which parts of the relations are more likely to join during the query execution and produce more results faster by doing fewer I/O accesses. Our empirical studies using standard benchmarks indicate that this approach outperforms similar methods considerably.
DOI: https://doi.org/10.1145/3401071.3401655
Published: 2020-06-14
Citations: 4
Best of both worlds: combining traditional and machine learning models for cardinality estimation
Lucas Woltmann, Claudio Hartmann, Dirk Habich, Wolfgang Lehner
Abstract: Cardinality estimation is a high-profile technique in database management systems with a serious impact on query performance. Thus, a lot of traditional approaches such as histograms-based or sampling-based methods have been developed over the last decades. With the advance of Machine Learning (ML) into the database world, cardinality estimation profits from several methods improving its quality as shown in different recent papers. However, neither an ML model nor a traditional approach meets all requirements for cardinality estimation, so that a one size fits all approach is difficult to imagine. For that reason, we advocate a better interlacing of ML models and traditional approaches for cardinality estimation and thoroughly consider their potential, advantages, and disadvantages in this paper. We start by proposing a classification of different estimation techniques and their usability for cardinality estimation. Then, we motivate a novel hybrid approach as the core proof of concept of this paper which uses the best of both worlds: ML models and the proven histogram approach. For this, we show in which cases it is beneficial to use ML models or when we can trust the traditional estimators. We evaluate our hybrid approach on two real-world data sets and conclude what can be done to improve the coexistence of traditional and ML approaches in DBMS. With all our proposals, we use ML to improve DBMS without abandoning years of valuable research in cardinality estimation.
DOI: https://doi.org/10.1145/3401071.3401658
Published: 2020-06-14
Citations: 5
PartLy
A. S. Abdelhamid, Walid G. Aref
Abstract: Data partitioning plays a critical role in data stream processing. Current data partitioning techniques use simple, static heuristics that do not incorporate feedback about the quality of the partitioning decision (i.e., fire and forget strategy). Hence, the data partitioner often repeatedly chooses the same decision. In this paper, we argue that reinforcement learning techniques can be applied to address this problem. The use of artificial neural networks can facilitate learning of efficient partitioning policies. We identify the challenges that emerge when applying machine learning techniques to the data partitioning problem for distributed data stream processing. Furthermore, we introduce PartLy, a proof-of-concept data partitioner, and present preliminary results that indicate PartLy's potential to match the performance of state-of-the-art techniques in terms of partitioning quality, while minimizing storage and processing overheads.
DOI: https://doi.org/10.1145/3401071.3401660
Published: 2020-06-14
Citations: 3
Automated tuning of query degree of parallelism via machine learning
Zhiwei Fan, Rathijit Sen, Paraschos Koutris, Aws Albarghouthi
Abstract: Determining the degree of parallelism (DOP) for query execution is of great importance to both performance and resource provisioning. However, recent work that applies machine learning (ML) to query optimization and query performance prediction in relational database management systems (RDBMSs) has ignored the effect of intra-query parallelism. In this work, we argue that determining the optimal or near-optimal DOP for query execution is a fundamental and challenging task that benefits both query performance and cost-benefit tradeoffs. We then present promising preliminary results on how ML techniques can be applied to automate DOP tuning. We conclude with a list of challenges we encountered, as well as future directions for our work.
DOI: https://doi.org/10.1145/3401071.3401656
Published: 2020-06-14
Citations: 9
RadixSpline: a single-pass learned index
Andreas Kipf, Ryan Marcus, Alexander van Renen, Mihail Stoian, A. Kemper, Tim Kraska, Thomas Neumann
Abstract: Recent research has shown that learned models can outperform state-of-the-art index structures in size and lookup performance. While this is a very promising result, existing learned structures are often cumbersome to implement and are slow to build. In fact, most approaches that we are aware of require multiple training passes over the data. We introduce RadixSpline (RS), a learned index that can be built in a single pass over the data and is competitive with state-of-the-art learned index models, like RMI, in size and lookup performance. We evaluate RS using the SOSD benchmark and show that it achieves competitive results on all datasets, despite the fact that it only has two parameters.
DOI: https://doi.org/10.1145/3401071.3401659
Published: 2020-04-30
Citations: 112
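The abstract above does not spell out the index layout, so the following is only a simplified sketch of the general idea the name suggests: an error-bounded linear spline over (key, position) pairs built in one pass, plus a radix table that narrows the search over spline knots by key prefix. It assumes sorted, unique, non-negative integer keys and at least two of them. All function names, and the choice of `max_error` and `radix_bits` as the two parameters, are assumptions of this sketch, not the authors' implementation.

```python
from bisect import bisect_left


def build_spline(keys, max_error):
    """One pass over sorted unique keys; returns knots as (key, position)."""
    knots = [(keys[0], 0)]
    base_key, base_pos = keys[0], 0
    lo, hi = 0.0, float("inf")  # slopes that keep every seen point within max_error
    for pos in range(1, len(keys)):
        key = keys[pos]
        dx = key - base_key
        if not (lo <= (pos - base_pos) / dx <= hi):
            # Current point leaves the corridor: the previous key becomes a knot.
            base_key, base_pos = keys[pos - 1], pos - 1
            knots.append((base_key, base_pos))
            dx = key - base_key
            lo, hi = 0.0, float("inf")
        lo = max(lo, (pos - max_error - base_pos) / dx)
        hi = min(hi, (pos + max_error - base_pos) / dx)
    if knots[-1][1] != len(keys) - 1:
        knots.append((keys[-1], len(keys) - 1))
    return knots


def build_radix_table(knots, radix_bits):
    """table[p] = index of the first knot whose key prefix is >= p."""
    shift = max(0, knots[-1][0].bit_length() - radix_bits)
    table = [len(knots)] * ((1 << radix_bits) + 1)
    next_prefix = 0
    for i, (key, _) in enumerate(knots):
        while next_prefix <= (key >> shift):
            table[next_prefix] = i
            next_prefix += 1
    return table, shift


def lookup(keys, knots, table, shift, max_error, key):
    """Position of the first stored key >= key (lower bound)."""
    if key <= knots[0][0]:
        return 0
    if key > knots[-1][0]:
        return len(keys)
    prefix = key >> shift
    knot_keys = [k for k, _ in knots]  # rebuilt per call for brevity only
    # The radix table limits the knot search to knots sharing the key's prefix.
    idx = bisect_left(knot_keys, key, table[prefix], table[prefix + 1])
    (k0, p0), (k1, p1) = knots[idx - 1], knots[idx]
    est = int(p0 + (key - k0) * (p1 - p0) / (k1 - k0))
    # Final search only inside the error window around the estimate.
    return bisect_left(keys, key, max(0, est - max_error - 1),
                       min(len(keys), est + max_error + 2))
```

A possible usage: build `knots = build_spline(keys, 32)` and `table, shift = build_radix_table(knots, 10)` once, then call `lookup(keys, knots, table, shift, 32, some_key)`. A smaller `max_error` yields more knots but a shorter final search, while more `radix_bits` enlarge the table but shorten the knot search.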
Proceedings of the Third International Workshop on Exploiting Artificial Intelligence Techniques for Data Management
DOI: https://doi.org/10.1145/3401071
Citations: 1