Petr Kubát, L. Bulej, T. Bures, Vojtech Horký, P. Tůma
{"title":"Adaptive Dispatch: A Pattern for Performance-Aware Software Self-Adaptation","authors":"Petr Kubát, L. Bulej, T. Bures, Vojtech Horký, P. Tůma","doi":"10.1145/3185768.3186406","DOIUrl":"https://doi.org/10.1145/3185768.3186406","url":null,"abstract":"Modern software systems often employ dynamic adaptation to runtime conditions in some parts of their functionality -- well known examples range from autotuning of computing kernels through adaptive battery saving strategies of mobile applications to dynamic load balancing and failover functionality in computing clouds. Typically, the implementation of these features is problem-specific -- a particular autotuner, a particular load balancer, and so on -- and enjoys little support from the implementation environment beyond standard programming constructs. In this work, we propose Adaptive Dispatch as a generic coding pattern for implementing dynamic adaptation. We believe that such pattern can make the implementation of dynamic adaptation features better in multiple aspects -- an explicit adaptation construct makes the presence of adaptation easily visible to programmers, lends itself to manipulation with development tools, and facilitates coordination of adaptation behavior at runtime. We present an implementation of the Adaptive Dispatch pattern as an internal DSL in Scala.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"109 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89956190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Victor H. F. Oliveira, Alex F. A. Furtunato, L. Silveira, Kyriakos Georgiou, K. Eder, S. X. D. Souza
{"title":"Application Speedup Characterization: Modeling Parallelization Overhead and Variations of Problem Size and Number of Cores.","authors":"Victor H. F. Oliveira, Alex F. A. Furtunato, L. Silveira, Kyriakos Georgiou, K. Eder, S. X. D. Souza","doi":"10.1145/3185768.3185770","DOIUrl":"https://doi.org/10.1145/3185768.3185770","url":null,"abstract":"To make efficient use of multi-core processors, it is important to understand the performance behavior of parallel applications. Modeling this can enable the use of online approaches to optimize throughput or energy, or even guarantee a minimum QoS. Accurate models would avoid probe different runtime configurations, which causes overhead. Throughout the years, many speedup models were proposed. Most of them based on Amdahl's or Gustafson's laws. However, many of those make considerations such as a fixed parallel fraction, or a parallel fraction that varies linearly with problem size, and inexistent parallelization overhead. Although such models aid in the theoretical understanding, these considerations do not hold in real environments, which makes the modeling unsuitable for accurate characterization of parallel applications. The model proposed estimates the speedup taking into account the variation of its parallel fraction according to problem size, number of cores used and overhead. Using four applications from the PARSEC benchmark suite, the proposed model was able to estimate speedups more accurately than other models in recent literature.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"51 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78312116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gunnar Brataas, B. Neumayr, C. G. Schütz, A. Vennesland
{"title":"Towards Scalability Guidelines for Semantic Data Container Management","authors":"Gunnar Brataas, B. Neumayr, C. G. Schütz, A. Vennesland","doi":"10.1145/3185768.3186302","DOIUrl":"https://doi.org/10.1145/3185768.3186302","url":null,"abstract":"Semantic container management is a promising approach to organize data. However, the scalability of this approach is challenging. By scalability in this paper, we mean the expressivity and size of the semantic data containers we can handle, given a suitable quality threshold. In this paper, we derive scalability characteristics of the semantic container approach in a structured way. We also describe actual experiments where we vary the number of available CPU cores and quality thresholds. We conclude this work-in-progress paper by describing how more measurements could be performed so that the missing guidelines could be provided.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90681842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SPARK Job Performance Analysis and Prediction Tool","authors":"Rekha Singhal, Chetan Phalak, P. Singh","doi":"10.1145/3185768.3185772","DOIUrl":"https://doi.org/10.1145/3185768.3185772","url":null,"abstract":"Spark is one of most widely deployed in-memory big data technology for parallel data processing across cluster of machines. The availability of these big data platforms on commodity machines has raised the challenge of assuring performance of applications with increase in data size. We have build a tool to assist application developer and tester to estimate an application execution time for larger data size before deployment. Conversely, the tool may also be used to estimate the competent cluster size for desired application performance in production environment. The tool can be used for detailed profiling of Spark job, post execution, to understand performance bottleneck. This tool incorporates different configurations of Spark cluster to estimate application performance. Therefore, it can also be used with optimization techniques to get tuned value of Spark parameters for an optimal performance. The tool's key innovations are support for different configurations of Spark platform for performance prediction and simulator to estimate Spark stage execution time which includes task execution variability due to HDFS, data skew and cluster nodes heterogeneity. The tool using model [3] has been shown to predict within 20% error bound for Wordcount, Terasort,Kmeans and few SQL workloads.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"80 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80676391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Performance Study of Big Data Workloads in Cloud Datacenters with Network Variability","authors":"Alexandru Uta, Harry Obaseki","doi":"10.1145/3185768.3186299","DOIUrl":"https://doi.org/10.1145/3185768.3186299","url":null,"abstract":"Public cloud computing platforms are a cost-effective solution for individuals and organizations to deploy various types of workloads, ranging from scientific applications, business-critical workloads, e-governance to big data applications. Co-locating all such different types of workloads in a single datacenter leads not only to performance degradation, but also to large degrees of performance variability, which is the result of virtualization, resource sharing and congestion. Many studies have already assessed and characterized the degree of resource variability in public clouds. However, we are missing a clear picture on how resource variability impacts big data workloads. In this work, we take a step towards characterizing the behavior of big data workloads under network bandwidth variability. Emulating real-world clouds» bandwidth distribution, we characterize the performance achieved by running real-world big data applications. We find that most big data workloads are slowed down under network variability scenarios, even those that are not network-bound. Moreover, the maximum average slowdown for the cloud setup with highest variability is 1.48 for CPU-bound workloads, and 1.79 for network-bound workloads.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"59 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74942011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Björn F. Postema, T. V. Damme, C. D. Persis, P. Tesi, B. Haverkort
{"title":"Combining Energy Saving Techniques in Data Centres using Model-Based Analysis","authors":"Björn F. Postema, T. V. Damme, C. D. Persis, P. Tesi, B. Haverkort","doi":"10.1145/3185768.3186310","DOIUrl":"https://doi.org/10.1145/3185768.3186310","url":null,"abstract":"Advanced power management and cooling techniques for data centres often co-exist as separate entities in current-day operation of data centres. This paper proposes to combine these techniques to achieve greater power savings. To this end, an existing theoretical thermal-aware model is integrated in an extensive simulation framework for data centres using power and performance models, which allows for a detailed study in power, performance and thermal metrics. The paper compares four distinct cases for studying the effect on these metrics: a data centre with (i) basic functionality; (ii) advanced cooling; (iii) advanced power management; and (iv) a combination thereof. The combined case shows a significant reduction in the energy consumption compared to the other cases while performance and thermal demands are kept intact. The combination of these techniques shows improvements in energy savings and shows it is meaningful to investigate further into smart combined energy saving techniques.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"59 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84310537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploratory Analysis of Spark Structured Streaming","authors":"Todor Ivanov, Jason Taafe","doi":"10.1145/3185768.3186360","DOIUrl":"https://doi.org/10.1145/3185768.3186360","url":null,"abstract":"In the Big Data era, stream processing has become a common requirement for many data-intensive applications. This has lead to many advances in the development and adaption of large scale streaming systems. Spark and Flink have become a popular choice for many developers as they combine both batch and streaming capabilities in a single system. However, introducing the Spark Structured Streaming in version 2.0 opened up completely new features for SparkSQL, which are alternatively only available in Apache Calcite. This work focuses on the new Spark Structured Streaming and analyses it by diving into its internal functionalities. With the help of a micro-benchmark consisting of streaming queries, we perform initial experiments evaluating the technology. Our results show that Spark Structured Streaming is able to run multiple queries successfully in parallel on data with changing velocity and volume sizes.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"35 16","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91439008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Workload-Dependent Performance Analysis of an In-Memory Database in a Multi-Tenant Configuration","authors":"Dominik Paluch, Harald Kienegger, H. Krcmar","doi":"10.1145/3185768.3186290","DOIUrl":"https://doi.org/10.1145/3185768.3186290","url":null,"abstract":"Modern in-memory database systems begin to provide multi-tenancy features. In contrast to the traditional operation of one large database appliance per system, the utilization of the multi-tenancy features allows for multiple database containers running on one system. Consequently, the database tenants share the same system resources, which has an influence on their performance. Understanding the performance of database tenants in different setups with varying workloads is a challenging task. However, knowledge of the performance behavior is crucial in order to benefit from multi-tenancy. In this paper, we provide fine-grained performance insights of the in-memory database SAP HANA in a multi-tenant configuration. We perform multiple benchmark runs utilizing an online analytical processing benchmark in order to retrieve information about the performance behavior of the multi-tenant database containers. Furthermore, we provide an analysis of the collected results and show a more efficient usage of threads in an environment with less active tenants under specific workload conditions.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"239 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89756283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Prediction for Families of Data-Intensive Software Applications","authors":"J. Verriet, R. Dankers, L. Somers","doi":"10.1145/3185768.3186405","DOIUrl":"https://doi.org/10.1145/3185768.3186405","url":null,"abstract":"Performance is a critical system property of any system, in particular of data-intensive systems, such as image processing systems. We describe a performance engineering method for families of data-intensive systems that is both simple and accurate; the performance of new family members is predicted using models of existing family members. The predictive models are calibrated using static code analysis and regression. Code analysis is used to extract performance profiles, which are used in combination with regression to derive predictive performance models. A case study presents the application for an industrial image processing case, which revealed as benefits the easy application and identification of code performance optimization points.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"98 5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82238611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Analysis of Workflow Formalisms for Workflows with Complex Non-Functional Requirements","authors":"L. Versluis, Erwin Van Eyk, A. Iosup","doi":"10.1145/3185768.3186297","DOIUrl":"https://doi.org/10.1145/3185768.3186297","url":null,"abstract":"Cloud and datacenter operators offer progressively more sophisticated service level agreements to customers. The Quality-of-Service guarantees by these operators have started to entail non-functional requirements customers have regarding their applications. At the same time, expressing applications as workflows in datacenters is increasingly more common. Currently, non-functional requirements (NFRs) can only be defined on entire workflows and cannot be changed at runtime, possibly wasting valuable resources. To move towards modifiable NFRs at the task level, there is a need for a formalism capable of expressing this. Existing formalisms do not support this level of granularity or are restricted to a subset of NFRs. In this work, we investigate the current support for NFRs in existing formalisms. Using a library containing workflows with and without NFRs, we inspect the capability of existing formalisms to express these requirements. Additionally, we create and evaluate five metrics to qualitatively and quantitatively compare each formalism. Our main findings are that although current formalisms do not support arbitrary NFRs per-task, the Directed Acyclic Graphs (DAGs) formalism is the most suitable to extend.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75523326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}