Tutorial on Challenges for Big Data Application Performance Tuning and Prediction

Companion Publication for ACM/SPEC on International Conference on Performance Engineering. Pub Date: 2016-03-12. DOI: 10.1145/2859889.2883587

Rekha Singhal



Abstract

Digitization of user services and cheap access to the internet have led to two critical problems: responding quickly to end-user queries, and analyzing large volumes of accumulated data faster to serve users better. This has also led to the advent of various big data processing technologies, each of which has architecture-specific parameters that must be tuned for optimal execution of an application. There are also challenges in scheduling analytic queries optimally for faster analysis, which leads to the problem of estimating the execution time of analytic queries for large data sizes on the production system. A production system may be an enterprise database system or a cluster of machines running Hadoop or a similar framework, where each machine may have a different hardware configuration (known as a heterogeneous environment). In the first part of this tutorial, we present the need for, and the challenges of, tuning big data applications on various platforms. This is followed by a discussion of existing solutions for application tuning. The second part of the tutorial presents the challenges and the state of the art in estimating application execution time.


