AutoQuo：利用强化学习选择查询计划的自适应计划优化器

IF 7.2 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge-Based Systems Pub Date : 2024-11-06 DOI:10.1016/j.knosys.2024.112664

Xiaoqiao Xiong , Jiong Yu , Zhenzhen He

{"title":"AutoQuo：利用强化学习选择查询计划的自适应计划优化器","authors":"Xiaoqiao Xiong , Jiong Yu , Zhenzhen He","doi":"10.1016/j.knosys.2024.112664","DOIUrl":null,"url":null,"abstract":"<div><div>An efficient execution plan generation is crucial for optimizing database queries. In exploring large table spaces to identify the most optimal table join orders, traditional cost-based optimizers may encounter challenges when confronted with complicated queries. Thus, learning-based optimizers have recently been proposed to leverage past experience and generate high-quality execution plans. However, these optimizers demonstrate limited generalization capabilities for workloads with diverse distributions.</div><div>In this study, an adaptive plan selector based on reinforcement learning is proposed to address these issues. However, three challenges remain: (1) How to generate optimal multi-table join orders? We adopt an exploration–exploitation strategy to traverse the vast search space composed of candidate tables, thereby evaluating the significance of each table. Long short-term memory (LSTM) networks are subsequently used to predict the performance of join orders and generate high-quality candidate plans. (2) How to automatically learn new features in novel datasets? We employ the Actor–Critic strategy, which involves jointly cross-training the policy and value networks. By adjusting the parameters based on real feedback obtained from the database, the new datasets are automatically learnt. (3) How to automatically select the best plan? We introduce a constraint-aware optimal plan selection model that captures the relationship between constraints and plans. This model guides the selection of the best plan under constraints of execution time, cardinality, cost, and mean-squared error (MSE). The experimental results on real datasets demonstrated the superiority of the proposed approach over state-of-the-art baselines. Compared with PostgreSQL, we observed a reduction of 29.73% in total latency and 28.36% in tail latency.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"306 ","pages":"Article 112664"},"PeriodicalIF":7.2000,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"AutoQuo: An Adaptive plan optimizer with reinforcement learning for query plan selection\",\"authors\":\"Xiaoqiao Xiong , Jiong Yu , Zhenzhen He\",\"doi\":\"10.1016/j.knosys.2024.112664\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>An efficient execution plan generation is crucial for optimizing database queries. In exploring large table spaces to identify the most optimal table join orders, traditional cost-based optimizers may encounter challenges when confronted with complicated queries. Thus, learning-based optimizers have recently been proposed to leverage past experience and generate high-quality execution plans. However, these optimizers demonstrate limited generalization capabilities for workloads with diverse distributions.</div><div>In this study, an adaptive plan selector based on reinforcement learning is proposed to address these issues. However, three challenges remain: (1) How to generate optimal multi-table join orders? We adopt an exploration–exploitation strategy to traverse the vast search space composed of candidate tables, thereby evaluating the significance of each table. Long short-term memory (LSTM) networks are subsequently used to predict the performance of join orders and generate high-quality candidate plans. (2) How to automatically learn new features in novel datasets? We employ the Actor–Critic strategy, which involves jointly cross-training the policy and value networks. By adjusting the parameters based on real feedback obtained from the database, the new datasets are automatically learnt. (3) How to automatically select the best plan? We introduce a constraint-aware optimal plan selection model that captures the relationship between constraints and plans. This model guides the selection of the best plan under constraints of execution time, cardinality, cost, and mean-squared error (MSE). The experimental results on real datasets demonstrated the superiority of the proposed approach over state-of-the-art baselines. Compared with PostgreSQL, we observed a reduction of 29.73% in total latency and 28.36% in tail latency.</div></div>\",\"PeriodicalId\":49939,\"journal\":{\"name\":\"Knowledge-Based Systems\",\"volume\":\"306 \",\"pages\":\"Article 112664\"},\"PeriodicalIF\":7.2000,\"publicationDate\":\"2024-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Knowledge-Based Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S095070512401298X\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S095070512401298X","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

高效的执行计划生成对于优化数据库查询至关重要。在探索大型表空间以确定最优表连接顺序时，传统的基于成本的优化器在面对复杂查询时可能会遇到挑战。因此，最近有人提出了基于学习的优化器，以利用过去的经验生成高质量的执行计划。本研究提出了一种基于强化学习的自适应计划选择器来解决这些问题。然而，仍然存在三个挑战：(1) 如何生成最优的多表连接顺序？我们采用探索-开发策略来遍历由候选表组成的巨大搜索空间，从而评估每个表的重要性。随后，利用长短期记忆（LSTM）网络预测连接顺序的性能，并生成高质量的候选计划。(2) 如何在新数据集中自动学习新特征？我们采用了 "行为批判"（Actor-Critic）策略，即联合交叉训练策略网络和价值网络。根据从数据库中获得的真实反馈来调整参数，从而自动学习新的数据集。(3) 如何自动选择最佳计划？我们引入了一个约束感知最优计划选择模型，该模型捕捉了约束和计划之间的关系。该模型可在执行时间、卡数、成本和均方误差（MSE）等约束条件下指导选择最佳计划。在真实数据集上的实验结果表明，所提出的方法优于最先进的基线方法。与 PostgreSQL 相比，我们观察到总延迟时间缩短了 29.73%，尾部延迟时间缩短了 28.36%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

AutoQuo: An Adaptive plan optimizer with reinforcement learning for query plan selection

An efficient execution plan generation is crucial for optimizing database queries. In exploring large table spaces to identify the most optimal table join orders, traditional cost-based optimizers may encounter challenges when confronted with complicated queries. Thus, learning-based optimizers have recently been proposed to leverage past experience and generate high-quality execution plans. However, these optimizers demonstrate limited generalization capabilities for workloads with diverse distributions.

In this study, an adaptive plan selector based on reinforcement learning is proposed to address these issues. However, three challenges remain: (1) How to generate optimal multi-table join orders? We adopt an exploration–exploitation strategy to traverse the vast search space composed of candidate tables, thereby evaluating the significance of each table. Long short-term memory (LSTM) networks are subsequently used to predict the performance of join orders and generate high-quality candidate plans. (2) How to automatically learn new features in novel datasets? We employ the Actor–Critic strategy, which involves jointly cross-training the policy and value networks. By adjusting the parameters based on real feedback obtained from the database, the new datasets are automatically learnt. (3) How to automatically select the best plan? We introduce a constraint-aware optimal plan selection model that captures the relationship between constraints and plans. This model guides the selection of the best plan under constraints of execution time, cardinality, cost, and mean-squared error (MSE). The experimental results on real datasets demonstrated the superiority of the proposed approach over state-of-the-art baselines. Compared with PostgreSQL, we observed a reduction of 29.73% in total latency and 28.36% in tail latency.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Knowledge-Based Systems 工程技术-计算机：人工智能

CiteScore

14.80

自引率

12.50%

发文量

1245

审稿时长

7.8 months

期刊介绍： Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.