Yingze Li, Xianglong Liu, Hongzhi Wang, Kaixin Zhang, Zixuan Wang
arXiv - CS - Databases, 2024-08-30. DOI: arxiv-2408.17209 (https://doi.org/arxiv-2408.17209)
Updateable Data-Driven Cardinality Estimator with Bounded Q-error
Modern cardinality estimators struggle with data updates. This work
tackles that challenge in the single-table setting. We introduce ICE, an Index-based
Cardinality Estimator, the first data-driven estimator that enables instant,
tuple-level updates. ICE draws two key lessons from multidimensional indexes and applies
them to cardinality estimation in dynamic scenarios: (1) An index can be
trained quickly and updated seamlessly over large volumes of
multidimensional data. (2) An index captures the data distribution precisely and stays
synchronized with the latest database version. These insights endow the index
with the ability to serve as a highly accurate, data-driven model that rapidly adapts
to data updates and is resilient to out-of-distribution queries at test
time. To make a single index support cardinality estimation, we
design algorithms for training, updating, and estimation,
and analyze their unbiasedness and variance. Extensive experiments show that ICE delivers accurate
estimates and fast updates and construction across diverse workloads. Compared to
state-of-the-art real-time query-driven models, ICE is 2-3 orders of magnitude
more accurate, updates 4.7-6.9 times faster,
and reduces training time by 1-3 orders of magnitude.
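
For readers unfamiliar with the metric in the title, the Q-error of an estimate is commonly defined as max(estimate / truth, truth / estimate), so a value of 1 means an exact estimate. The sketch below illustrates that definition and the general idea that a structure updated per tuple stays in sync with the data. It is not ICE's algorithm: the class SortedColumnIndex, its method names, and the choice to clamp counts to 1 are assumptions made purely for illustration, and the toy index covers a single column only.

```python
import bisect


def q_error(estimate: float, truth: float) -> float:
    """Factor by which the estimate is off; always >= 1."""
    # Assumption: clamp both values to 1 so empty results do not divide by zero.
    est = max(float(estimate), 1.0)
    tru = max(float(truth), 1.0)
    return max(est / tru, tru / est)


class SortedColumnIndex:
    """Hypothetical one-column index: a sorted list of the stored values."""

    def __init__(self, values=()):
        self._keys = sorted(values)

    def insert(self, value):
        # Tuple-level update: the index reflects the new row immediately.
        bisect.insort(self._keys, value)

    def delete(self, value):
        # Remove one occurrence of value, if present.
        i = bisect.bisect_left(self._keys, value)
        if i < len(self._keys) and self._keys[i] == value:
            del self._keys[i]

    def count_range(self, lo, hi):
        """Number of stored values v with lo <= v <= hi."""
        left = bisect.bisect_left(self._keys, lo)
        right = bisect.bisect_right(self._keys, hi)
        return right - left


index = SortedColumnIndex([1, 3, 3, 7, 9])
stale = index.count_range(2, 8)   # answer computed before the updates
index.insert(5)
index.insert(6)
fresh = index.count_range(2, 8)   # answer computed after the updates

print("Q-error of the stale answer:", q_error(stale, fresh))
print("Q-error of the fresh answer:", q_error(fresh, fresh))
```

The stale count stands in for a model that was not refreshed after the inserts, which is the situation the abstract describes for estimators that cannot absorb updates quickly.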