Ruijie Miao, Fenghao Dong, Yikai Zhao, Yiming Zhao, Yuhan Wu, Kaicheng Yang, Tong Yang, Bin Cui
Published in: 2023 IEEE 39th International Conference on Data Engineering (ICDE), 2023-04-01
DOI: 10.1109/ICDE55515.2023.00157 (https://doi.org/10.1109/ICDE55515.2023.00157)
SketchConf: A Framework for Automatic Sketch Configuration
Sketches have emerged as promising solutions for frequency estimation, one of the most fundamental tasks in approximate data stream processing. In many scenarios, users need to apply sketches under specified error constraints. In this paper, we explore how to configure sketch parameters to satisfy user-defined error constraints. We propose SketchConf, an automatic sketch configuration framework which, for the first time, efficiently generates memory-optimal configurations. We show that SketchConf can be applied to order-independent sketches, including the CM, Count, Tower, and Nitro sketches. We further discuss how to handle unknown and changing workloads when applying SketchConf to real-world stream processing scenarios. Experimental results show that SketchConf can be up to 715.51 times faster than the baseline algorithm, and that the configurations it outputs save up to 99.99% memory and achieve up to 27.44 times the throughput of the theory-based configurations. The code is open-sourced on GitHub.
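For context on what a "theory-based configuration" can look like, here is a minimal sketch assuming the textbook Count-Min (CM) bound: with width ceil(e/epsilon) and depth ceil(ln(1/delta)), each estimate exceeds the true frequency by at most epsilon times the total stream count, with probability at least 1 - delta. This is not SketchConf's algorithm, which the abstract does not detail; the names `theory_based_cm_config` and `CountMinSketch` are hypothetical, and the hashing scheme is an illustrative assumption.

```python
import math
import random


def theory_based_cm_config(epsilon, delta):
    """Textbook CM sizing (an assumption, not SketchConf's method)."""
    width = math.ceil(math.e / epsilon)        # counters per row
    depth = math.ceil(math.log(1.0 / delta))   # number of rows
    return width, depth


class CountMinSketch:
    """Minimal CM sketch; per-row hashing via salted built-in hash()."""

    def __init__(self, width, depth, seed=0):
        self.width = width
        self.depth = depth
        rng = random.Random(seed)
        # One salt per row so the rows use different hash functions.
        self.salts = [rng.getrandbits(32) for _ in range(depth)]
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        return hash((self.salts[row], key)) % self.width

    def insert(self, key, count=1):
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def query(self, key):
        # The minimum over rows never underestimates the true frequency.
        return min(self.rows[r][self._index(r, key)]
                   for r in range(self.depth))


width, depth = theory_based_cm_config(epsilon=0.01, delta=0.01)
print(width, depth, width * depth)  # row width, row count, total counters
```

With epsilon = 0.01 and delta = 0.01 this sizing gives 272 counters per row and 5 rows. Such worst-case bounds ignore the actual workload, which is one plausible reason a workload-aware configuration could need far less memory for the same error target.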