Automatic View Generation with Deep Learning and Reinforcement Learning

Haitao Yuan, Guoliang Li, Ling Feng, Ji Sun, Yue Han
2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 1501-1512. Published 2020-04-01. DOI: 10.1109/ICDE48307.2020.00133 (https://doi.org/10.1109/ICDE48307.2020.00133)
Citations: 44

Abstract

Materializing views is an important method for reducing redundant computation in a DBMS, especially when processing large-scale analytical queries. However, many existing methods still require DBAs to generate materialized views manually, which does not scale to a large number of database instances, especially in cloud databases. To address this problem, we propose an automatic view generation method that judiciously selects "highly beneficial" subqueries to generate materialized views. There are two challenges: (1) How to estimate the benefit of using a materialized view for a query? (2) How to select optimal subqueries to generate materialized views? To address the first challenge, we propose a neural-network-based method to estimate the benefit of using a materialized view to answer a query. In particular, we extract significant features from different perspectives and design effective encoding models to transform these features into hidden representations. To address the second challenge, we model the selection as an ILP (Integer Linear Programming) problem, which aims to maximize utility by selecting optimal subqueries to materialize. We design an iterative optimization method to select subqueries to materialize. However, this method cannot guarantee that the solution converges. To address this issue, we model the iterative optimization process as an MDP (Markov Decision Process) and use a deep reinforcement learning model to solve the problem. Extensive experiments show that our method outperforms existing solutions by 28.4%, 8.8% and 31.7% on three real-world datasets.
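To make the selection problem concrete, the following is a minimal sketch of a 0-1 view selection on a toy instance. It is not the paper's formulation or solver: the benefit and cost numbers, the names `benefit`, `cost`, `utility` and `best_selection`, and the assumption that each query uses at most one selected view are all hypothetical, and exhaustive search stands in for the iterative optimization and reinforcement learning described in the abstract.

```python
from itertools import product

# Hypothetical estimated benefits: benefit[q][v] is the saving when query q
# is answered with view v. In practice such estimates would come from a
# learned benefit model; these are made-up numbers for illustration only.
benefit = {
    "q1": {"v1": 5.0, "v2": 2.0},
    "q2": {"v2": 4.0},
    "q3": {"v1": 1.0, "v3": 3.0},
}

# Hypothetical one-off cost of materializing each candidate view.
cost = {"v1": 3.0, "v2": 3.5, "v3": 4.0}


def utility(selected):
    """Total benefit minus materialization cost for a set of selected views.

    Assumption: each query uses at most one selected view, namely the one
    with the largest estimated benefit for that query.
    """
    gain = 0.0
    for per_view in benefit.values():
        usable = [b for v, b in per_view.items() if v in selected]
        gain += max(usable, default=0.0)
    return gain - sum(cost[v] for v in selected)


def best_selection():
    """Enumerate every 0-1 assignment over the views and keep the best one."""
    views = sorted(cost)
    best, best_u = frozenset(), 0.0
    for bits in product((0, 1), repeat=len(views)):
        chosen = frozenset(v for v, b in zip(views, bits) if b)
        u = utility(chosen)
        if u > best_u:
            best, best_u = chosen, u
    return best, best_u


if __name__ == "__main__":
    chosen, value = best_selection()
    print(sorted(chosen), value)
```

Exhaustive enumeration is exponential in the number of candidate views, so it is only usable on toy inputs like this one; that blow-up is one reason a scalable selection strategy is needed for realistic workloads.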