A Tailored Regression for Learned Indexes: Logarithmic Error Regression

Martin Eppert, Philipp Fent, Thomas Neumann
Fourth Workshop in Exploiting AI Techniques for Data Management, 2021-06-20
DOI: 10.1145/3464509.3464891
Although linear regressions are essential for learned index structures, most implementations use simple linear regression, which minimizes the squared error. Since learned indexes locate keys with exponential search, regressions that minimize the logarithmic error are much better tailored to this use case. By adopting this fitting optimization target, we can significantly improve a learned index's lookup performance without architectural changes. While the log-error is harder to optimize, our novel algorithms and optimization heuristics deliver a practical improvement in lookup latency. Even when fast build times are paramount, log-error regressions still provide a robust fallback for degenerate leaf models. The resulting regressions are much better suited to learned indexes, and speed up lookups on data sets with outliers by more than a factor of 2.
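The abstract contrasts the squared-error objective of simple linear regression with a logarithmic-error objective matched to exponential search, whose cost grows with log2 of the prediction error. The sketch below illustrates that idea only; it is not the paper's algorithm. It fits a closed-form least-squares line over key/position pairs, measures the mean log2-error, and then refines the fit with a naive coordinate-descent search on that objective. The refinement strategy and all function names are illustrative assumptions.

```python
import math

def least_squares(xs, ys):
    # Closed-form simple linear regression: minimizes squared error.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def mean_log_error(slope, intercept, xs, ys):
    # log2(|predicted - actual| + 1) approximates the number of
    # exponential-search steps needed to correct a misprediction.
    return sum(math.log2(abs(slope * x + intercept - y) + 1)
               for x, y in zip(xs, ys)) / len(xs)

def fit_log_error(xs, ys, iters=200):
    # Naive coordinate-descent refinement of the least-squares fit
    # under the log-error objective (illustration only; the paper
    # develops dedicated algorithms and heuristics for this).
    slope, intercept = least_squares(xs, ys)
    step_s, step_i = abs(slope) * 0.5 + 1e-9, 1.0
    best = mean_log_error(slope, intercept, xs, ys)
    for _ in range(iters):
        improved = False
        for ds in (-step_s, step_s):
            for di in (-step_i, step_i):
                e = mean_log_error(slope + ds, intercept + di, xs, ys)
                if e < best:
                    best, slope, intercept = e, slope + ds, intercept + di
                    improved = True
        if not improved:  # shrink the search steps and retry
            step_s *= 0.5
            step_i *= 0.5
    return slope, intercept
```

On a key set with an outlier (e.g. keys 0..9 plus key 1000), the least-squares line is dragged toward the outlier, inflating the log-error for the bulk of the keys; the log-error refinement trades a larger squared error for fewer expected exponential-search steps, which is what lookup latency actually pays for.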