Towards a General Framework for ML-based Self-tuning Databases

Proceedings of the 1st Workshop on Machine Learning and Systems Pub Date : 2020-11-16 DOI:10.1145/3437984.3458830

Thomas Schmied, Diego Didona, Andreas Doring, Thomas Parnell, Nikolas Ioannou


Citations: 8

Abstract

Machine learning (ML) methods have recently emerged as an effective way to perform automated parameter tuning of databases. State-of-the-art approaches include Bayesian optimization (BO) and reinforcement learning (RL). In this work, we describe our experience applying these methods to a database not yet studied in this context: FoundationDB. First, we describe the challenges we faced, such as unknown valid ranges of configuration parameters and combinations of parameter values that result in invalid runs, and how we mitigated them. While these issues are typically overlooked, we argue that they are a crucial barrier to the adoption of ML self-tuning techniques in databases, and thus deserve more attention from the research community. Second, we present experimental results obtained when tuning FoundationDB using ML methods. Unlike prior work in this domain, we also compare with the simplest of baselines: random search. Our results show that, while BO and RL methods can improve the throughput of FoundationDB by up to 38%, random search is a highly competitive baseline, finding a configuration that is only 4% worse than those found by the vastly more complex ML methods. We conclude that future work in this area may want to focus more on randomized, model-free optimization algorithms.
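
To make the random-search baseline and the handling of invalid runs concrete, the following is a minimal sketch and not the authors' implementation. The knob names, their ranges, the validity rule, and the synthetic `run_benchmark` function are all hypothetical stand-ins; a real tuner would launch the database with each sampled configuration and measure throughput under a workload.

```python
import random

# Hypothetical search space: knob names and ranges are illustrative only,
# not real FoundationDB parameters.
SEARCH_SPACE = {
    "knob_a": (1, 64),
    "knob_b": (1, 1024),
}


def sample_config(rng):
    """Draw one configuration uniformly at random from SEARCH_SPACE."""
    return {name: rng.randint(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def run_benchmark(config):
    """Synthetic stand-in for a benchmark run.

    Returns a throughput value, or None when the configuration is invalid.
    Both the validity rule and the throughput formula are made up.
    """
    if config["knob_a"] * config["knob_b"] > 32768:
        return None  # simulated invalid combination of parameter values
    return 1000.0 + 10.0 * config["knob_a"] - abs(config["knob_b"] - 256)


def random_search(budget, seed=0):
    """Evaluate `budget` random configurations and keep the best valid one."""
    rng = random.Random(seed)
    best_config, best_throughput = None, float("-inf")
    invalid_runs = 0
    for _ in range(budget):
        config = sample_config(rng)
        throughput = run_benchmark(config)
        if throughput is None:
            invalid_runs += 1  # skip invalid runs instead of aborting the search
            continue
        if throughput > best_throughput:
            best_config, best_throughput = config, throughput
    return best_config, best_throughput, invalid_runs
```

Calling `random_search(50)` returns the best valid configuration found, its synthetic throughput, and the number of invalid runs that were skipped. Counting invalid runs rather than discarding them silently is one simple way to see how much of the evaluation budget is lost to bad parameter combinations.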

