Can Uncertainty Quantification Improve Learned Index Benefit Estimation?

IF 10.4 · CAS Tier 2, Computer Science · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Knowledge and Data Engineering · Pub Date: 2025-08-04 · DOI: 10.1109/TKDE.2025.3591237

Tao Yu;Zhaonian Zou;Hao Xiong

Volume 37, Issue 10, pp. 5823-5837. https://ieeexplore.ieee.org/document/11111721/

Citations: 0

Abstract

Index tuning is crucial for optimizing database performance by selecting optimal indexes based on workload. The key to this process lies in an accurate and efficient benefit estimator. Traditional methods relying on what-if tools often suffer from inefficiency and inaccuracy. In contrast, learning-based models provide a promising alternative but face challenges such as instability, lack of interpretability, and complex management. To overcome these limitations, we adopt a novel approach: quantifying the uncertainty in learning-based models’ results, thereby combining the strengths of both traditional and learning-based methods for reliable index tuning. We propose Beauty, the first uncertainty-aware framework that enhances learning-based models with uncertainty quantification and uses what-if tools as a complementary mechanism to improve reliability and reduce management complexity. Specifically, we introduce a novel method that combines AutoEncoder and Monte Carlo Dropout to jointly quantify uncertainty, tailored to the characteristics of benefit estimation tasks. In experiments involving sixteen models, our approach outperformed existing uncertainty quantification methods in the majority of cases. We also conducted index tuning tests on six datasets. By applying the Beauty framework, we eliminated worst-case scenarios and more than tripled the occurrence of best-case scenarios.
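
The routing idea described in the abstract (trust the learned estimate when its uncertainty is low, otherwise consult a what-if tool) can be illustrated with a minimal sketch. This is not the Beauty implementation: the toy linear model, the `threshold` value, and the `what_if` callable are all hypothetical stand-ins, and the AutoEncoder component of the paper's method is omitted. Only the Monte Carlo Dropout part is shown, using the spread of repeated stochastic predictions as the uncertainty signal.

```python
import random
import statistics


def mc_dropout_predict(weights, features, drop_prob=0.2, passes=50, rng=None):
    """Monte Carlo Dropout on a toy linear model (hypothetical stand-in
    for a learned benefit estimator). Dropout stays active at inference;
    returns (mean, std) over `passes` stochastic forward passes."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    keep = 1.0 - drop_prob
    preds = []
    for _ in range(passes):
        total = 0.0
        for w, x in zip(weights, features):
            # Inverted dropout: keep each input with probability `keep`
            # and rescale the survivors by 1 / keep.
            if rng.random() < keep:
                total += w * x / keep
        preds.append(total)
    return statistics.fmean(preds), statistics.pstdev(preds)


def estimate_benefit(weights, features, what_if, threshold, **mc_kwargs):
    """Return (benefit, source). Falls back to the hypothetical `what_if`
    callable when the prediction spread exceeds `threshold` (assumed value)."""
    mean, std = mc_dropout_predict(weights, features, **mc_kwargs)
    if std > threshold:
        return what_if(features), "what-if"
    return mean, "learned"
```

For example, `estimate_benefit([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], what_if=my_tool, threshold=0.5)` would return the learned mean only if the standard deviation across passes stays at or below 0.5; `my_tool` here is a placeholder for whatever what-if interface is available. How the threshold should be set, and how the AutoEncoder signal is combined with the dropout signal, is specified in the paper rather than here.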

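Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 11.70

Self-citation rate: 3.40%

Articles published: 515

Review time: 6 months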

About the journal: The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.