{"title":"Tutorial on Challenges for Big Data Application Performance Tuning and Prediction","authors":"Rekha Singhal","doi":"10.1145/2859889.2883587","DOIUrl":null,"url":null,"abstract":"Digitization of user services and cheap access to the internet has led to two critical problems- quick response to end-user queries and faster analysis of large accumulated data to serve users better. This has also led to the advent of various big data processing technologies, each of them has architecture specific parameters to tune for optimal execution of the application. There are also challenges in optimal scheduling of analytic queries for faster analysis, which lead to the problem of estimating analytic queries execution time for large data sizes on the production system. A production system may be an enterprise database system or a cluster of machines with Hadoop etc, where each machine may be of different hardware configuration (known as heterogeneous environment). In the first part of this tutorial, we shall present need and challenges for tuning big data applications on various platforms. This is followed by discussion on various existing solutions for application tuning. 
The second part of the tutorial presents the challenges and state of the art for estimating application execution time.","PeriodicalId":265808,"journal":{"name":"Companion Publication for ACM/SPEC on International Conference on Performance Engineering","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Companion Publication for ACM/SPEC on International Conference on Performance Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2859889.2883587","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Digitization of user services and cheap access to the internet have led to two critical problems: responding quickly to end-user queries, and analyzing large volumes of accumulated data faster to serve users better. This has also led to the advent of various big data processing technologies, each of which has architecture-specific parameters that must be tuned for optimal execution of an application. There are also challenges in optimally scheduling analytic queries for faster analysis, which lead to the problem of estimating the execution time of analytic queries for large data sizes on the production system. A production system may be an enterprise database system or a cluster of machines running Hadoop or a similar framework, where each machine may have a different hardware configuration (known as a heterogeneous environment). In the first part of this tutorial, we present the need for, and the challenges of, tuning big data applications on various platforms. This is followed by a discussion of existing solutions for application tuning. The second part of the tutorial presents the challenges and the state of the art in estimating application execution time.