Smarter Warehouse

2022 IEEE 38th International Conference on Data Engineering Workshops (ICDEW). Pub Date: 2022-05-01. DOI: 10.1109/icdew55742.2022.00005

N. Laptev, Wenbo Tao, C. Komurlu, Jason Xu, Deke Sun, T. Lux, Luo Mi

Citations: 0

Abstract

Warehouse users often face too many decisions about their queries, pipelines, workflows and data when trying to optimize the resources they use and the quality and availability of their data. For example, they must decide whether to use Spark or Presto, how best to partition their data, and which hyper-parameters to tune to resolve various query or pipeline problems. Furthermore, warehouse users are often unaware of large performance opportunities around data skew, multi-query optimization, query materialization and more. In this paper we describe the Smarter Warehouse initiative, which aims to automate or simplify many of these optimization decisions. Our long-term vision is for a large portion of the Smarter Warehouse optimizations to be seamlessly incorporated into the compute and I/O layers of the stack, leading to a simpler warehouse user experience and substantial resource savings.
