{"title":"AutoQuo: An Adaptive plan optimizer with reinforcement learning for query plan selection","authors":"Xiaoqiao Xiong , Jiong Yu , Zhenzhen He","doi":"10.1016/j.knosys.2024.112664","DOIUrl":null,"url":null,"abstract":"<div><div>An efficient execution plan generation is crucial for optimizing database queries. In exploring large table spaces to identify the most optimal table join orders, traditional cost-based optimizers may encounter challenges when confronted with complicated queries. Thus, learning-based optimizers have recently been proposed to leverage past experience and generate high-quality execution plans. However, these optimizers demonstrate limited generalization capabilities for workloads with diverse distributions.</div><div>In this study, an adaptive plan selector based on reinforcement learning is proposed to address these issues. However, three challenges remain: (1) How to generate optimal multi-table join orders? We adopt an exploration–exploitation strategy to traverse the vast search space composed of candidate tables, thereby evaluating the significance of each table. Long short-term memory (LSTM) networks are subsequently used to predict the performance of join orders and generate high-quality candidate plans. (2) How to automatically learn new features in novel datasets? We employ the Actor–Critic strategy, which involves jointly cross-training the policy and value networks. By adjusting the parameters based on real feedback obtained from the database, the new datasets are automatically learnt. (3) How to automatically select the best plan? We introduce a constraint-aware optimal plan selection model that captures the relationship between constraints and plans. This model guides the selection of the best plan under constraints of execution time, cardinality, cost, and mean-squared error (MSE). The experimental results on real datasets demonstrated the superiority of the proposed approach over state-of-the-art baselines. Compared with PostgreSQL, we observed a reduction of 29.73% in total latency and 28.36% in tail latency.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"306 ","pages":"Article 112664"},"PeriodicalIF":7.2000,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S095070512401298X","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
An efficient execution plan generation is crucial for optimizing database queries. In exploring large table spaces to identify the most optimal table join orders, traditional cost-based optimizers may encounter challenges when confronted with complicated queries. Thus, learning-based optimizers have recently been proposed to leverage past experience and generate high-quality execution plans. However, these optimizers demonstrate limited generalization capabilities for workloads with diverse distributions.
In this study, an adaptive plan selector based on reinforcement learning is proposed to address these issues. However, three challenges remain: (1) How to generate optimal multi-table join orders? We adopt an exploration–exploitation strategy to traverse the vast search space composed of candidate tables, thereby evaluating the significance of each table. Long short-term memory (LSTM) networks are subsequently used to predict the performance of join orders and generate high-quality candidate plans. (2) How to automatically learn new features in novel datasets? We employ the Actor–Critic strategy, which involves jointly cross-training the policy and value networks. By adjusting the parameters based on real feedback obtained from the database, the new datasets are automatically learnt. (3) How to automatically select the best plan? We introduce a constraint-aware optimal plan selection model that captures the relationship between constraints and plans. This model guides the selection of the best plan under constraints of execution time, cardinality, cost, and mean-squared error (MSE). The experimental results on real datasets demonstrated the superiority of the proposed approach over state-of-the-art baselines. Compared with PostgreSQL, we observed a reduction of 29.73% in total latency and 28.36% in tail latency.
期刊介绍:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.