Hani Nemati, S. V. Azhari, Mahsa Shakeri, M. Dagenais
{"title":"Host-Based Virtual Machine Workload Characterization Using Hypervisor Trace Mining","authors":"Hani Nemati, S. V. Azhari, Mahsa Shakeri, M. Dagenais","doi":"10.1145/3460197","DOIUrl":"https://doi.org/10.1145/3460197","url":null,"abstract":"Cloud computing is a fast-growing technology that provides on-demand access to a pool of shared resources. This type of distributed and complex environment requires advanced resource management solutions that could model virtual machine (VM) behavior. Different workload measurements, such as CPU, memory, disk, and network usage, are usually derived from each VM to model resource utilization and group similar VMs. However, these coarse workload metrics require internal access to each VM with the available performance analysis toolkit, which is not feasible under the privacy policies of many cloud environments. In this article, we propose a non-intrusive host-based virtual machine workload characterization using hypervisor tracing. VM blocking durations, along with virtual interrupt injection rates, are derived as features to reveal multiple levels of resource intensiveness. In addition, the VM exit reason is considered, as well as the resource contention rate due to the host and other VMs. Moreover, the process and thread preemption rates in each VM are extracted using the collected tracing logs. Our proposed approach further improves the selected features by exploiting a page-ranking-based algorithm to filter non-important processes running on each VM. Once the metric features are defined, a two-stage VM clustering technique is employed to perform both coarse- and fine-grain workload characterization. The inter-cluster and intra-cluster similarity metrics of the silhouette score are used to reveal distinct VM workload groups, as well as the ones with significant overlap. The proposed framework can provide a detailed view of the underlying behavior of the running VMs. 
This can assist infrastructure administrators in efficient resource management, as well as root cause analysis.","PeriodicalId":105474,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129819601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Guilherme de Melo Baptista Domingues, Gabriel Mendonça, E. D. S. E. Silva, R. M. Leão, D. Menasché, Ori Rottenstreich, Mostafa Dehghan, D. Towsley
{"title":"The Role of Hysteresis in Caching Systems","authors":"Guilherme de Melo Baptista Domingues, Gabriel Mendonça, E. D. S. E. Silva, R. M. Leão, D. Menasché, Ori Rottenstreich, Mostafa Dehghan, D. Towsley","doi":"10.1145/3450564","DOIUrl":"https://doi.org/10.1145/3450564","url":null,"abstract":"Caching has been a fundamental element of networking systems since the early days of the Internet. By filtering requests toward custodians, caches reduce the bandwidth required by the latter and the delay experienced by clients. The requests that are not served by a cache, in turn, comprise its miss stream. We refer to the dependence of the cache state and miss stream on its history as hysteresis. Although hysteresis is at the core of caching systems, a dimension that has not been systematically studied in previous works relates to its impact on caching systems between misses, evictions, and insertions. In this article, we propose novel mechanisms and models to leverage hysteresis on cache evictions and insertions. The proposed solutions extend TTL-like mechanisms and rely on two knobs to tune the time between insertions and evictions given a target hit rate. 
We show the general benefits of hysteresis and the particular improvement of the two thresholds strategy in reducing download times, making the system more predictable and accounting for different costs associated with object retrieval.","PeriodicalId":105474,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131226327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hosein Mohammadi Makrani, H. Sayadi, Najmeh Nazari, Sai Manoj Pudukotai Dinakarrao, Avesta Sasan, T. Mohsenin, S. Rafatirad, H. Homayoun
{"title":"Adaptive Performance Modeling of Data-intensive Workloads for Resource Provisioning in Virtualized Environment","authors":"Hosein Mohammadi Makrani, H. Sayadi, Najmeh Nazari, Sai Manoj Pudukotai Dinakarrao, Avesta Sasan, T. Mohsenin, S. Rafatirad, H. Homayoun","doi":"10.1145/3442696","DOIUrl":"https://doi.org/10.1145/3442696","url":null,"abstract":"The processing of data-intensive workloads is a challenging and time-consuming task that often requires massive infrastructure to ensure fast data analysis. The cloud platform is the most popular and powerful scale-out infrastructure to perform big data analytics and eliminate the need to maintain expensive and high-end computing resources at the user side. The performance and the cost of such infrastructure depend on the overall server configuration, such as processor, memory, network, and storage configurations. In addition to the cost of owning or maintaining the hardware, the heterogeneity in the server configuration further expands the selection space, leading to non-convergence. The challenge is further exacerbated by the dependency of the application’s performance on the underlying hardware. Despite an increasing interest in resource provisioning, little work has been done to develop accurate and practical models to proactively predict the performance of data-intensive applications corresponding to the server configuration and provision a cost-optimal configuration online. In this work, through a comprehensive real-system empirical analysis of performance, we address these challenges by introducing ProMLB: a proactive machine-learning-based methodology for resource provisioning. We first characterize diverse types of data-intensive workloads across different types of server architectures. The characterization helps accurately capture applications’ behavior and train a model for prediction of their performance. Then, ProMLB builds a set of cross-platform performance models for each application. 
Based on the developed predictive model, ProMLB uses an optimization technique to identify a close-to-optimal configuration that minimizes the product of execution time and cost. Compared to the oracle scheduler, ProMLB achieves 91% accuracy in terms of application-resource matching. On average, ProMLB improves the performance and resource utilization by 42.6% and 41.1%, respectively, compared to a baseline scheduler. Moreover, ProMLB improves the performance per cost by 2.5× on average.","PeriodicalId":105474,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131761936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Thread and Data Mapping Using a Sharing-Aware Memory Management Unit","authors":"E. Cruz, M. Diener, L. Pilla, P. Navaux","doi":"10.1145/3433687","DOIUrl":"https://doi.org/10.1145/3433687","url":null,"abstract":"Current and future architectures rely on thread-level parallelism to sustain performance growth. These architectures have introduced a complex memory hierarchy, consisting of several cores organized hierarchically with multiple cache levels and NUMA nodes. These memory hierarchies can have an impact on the performance and energy efficiency of parallel applications as the importance of memory access locality is increased. In order to improve locality, the analysis of the memory access behavior of parallel applications is critical for mapping threads and data. Nevertheless, most previous work relies on indirect information about the memory accesses, or does not combine thread and data mapping, resulting in less accurate mappings. In this paper, we propose the Sharing-Aware Memory Management Unit (SAMMU), an extension to the memory management unit that allows it to detect the memory access behavior in hardware. With this information, the operating system can perform online mapping without any previous knowledge about the behavior of the application. In the evaluation with a wide range of parallel applications (NAS Parallel Benchmarks and PARSEC Benchmark Suite), performance was improved by up to 35.7% (10.0% on average) and energy efficiency was improved by up to 11.9% (4.1% on average). 
These improvements happened due to a substantial reduction of cache misses and interconnection traffic.","PeriodicalId":105474,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123980882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Online Algorithms for File-Bundle Caching and Generalization to Distributed Caching","authors":"Tiancheng Qin, S. Etesami","doi":"10.1145/3445028","DOIUrl":"https://doi.org/10.1145/3445028","url":null,"abstract":"We consider a generalization of the standard cache problem called file-bundle caching, where different queries (tasks), each containing l ≥ 1 files, sequentially arrive. An online algorithm that does not know the sequence of queries ahead of time must adaptively decide on what files to keep in the cache to incur the minimum number of cache misses. Here a cache miss refers to the case where at least one file in a query is missing among the cache files. In the special case where l = 1, this problem reduces to the standard cache problem. We first analyze the performance of the classic least recently used (LRU) algorithm in this setting and show that LRU is a near-optimal online deterministic algorithm for file-bundle caching with regard to competitive ratio. We then extend our results to a generalized (h, k)-paging problem in this file-bundle setting, where the performance of the online algorithm with a cache size k is compared to an optimal offline benchmark of a smaller cache size h < k. In this latter case, we provide a randomized O(l ln(k/(k - h)))-competitive algorithm for our generalized (h, k)-paging problem, which can be viewed as an extension of the classic marking algorithm. We complete this result by providing a matching lower bound for the competitive ratio, indicating that the performance of this modified marking algorithm is within a factor of 2 of any randomized online algorithm. Finally, we look at the distributed version of the file-bundle caching problem where there are m ≥ 1 identical caches in the system. 
In this case, we show that for m = l + 1 caches, there is a deterministic distributed caching algorithm that is (l^2 + l)-competitive and a randomized distributed caching algorithm that is O(l ln(2l + 1))-competitive when l ≥ 2. We also provide a general framework to devise other efficient algorithms for the distributed file-bundle caching problem and evaluate the performance of our results through simulations.","PeriodicalId":105474,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125840437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhengchun Liu, R. Kettimuthu, Joaquín Chung, R. Ananthakrishnan, M. Link, Ian T Foster
{"title":"Design and Evaluation of a Simple Data Interface for Efficient Data Transfer across Diverse Storage","authors":"Zhengchun Liu, R. Kettimuthu, Joaquín Chung, R. Ananthakrishnan, M. Link, Ian T Foster","doi":"10.1145/3452007","DOIUrl":"https://doi.org/10.1145/3452007","url":null,"abstract":"Modern science and engineering computing environments often feature storage systems of different types, from parallel file systems in high-performance computing centers to object stores operated by cloud providers. To enable easy, reliable, secure, and performant data exchange among these different systems, we propose Connector, a pluggable data access architecture for diverse, distributed storage. By hiding low-level storage system details, this abstraction permits a managed data transfer service (Globus, in our case) to interact with a large and easily extended set of storage systems. Equally important, it supports third-party transfers: that is, direct data transfers from source to destination that are initiated by a third-party client but do not engage that third party in the data path. The abstraction also enables management of transfers for performance optimization, error handling, and end-to-end integrity. We present the Connector design, describe implementations for different storage services, evaluate tradeoffs inherent in managed vs. 
direct transfers, motivate recommended deployment options, and propose a model-based method that allows for easy characterization of performance in different contexts without exhaustive benchmarking.","PeriodicalId":105474,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130330080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maria Plakia, Evripides Tzamousis, Thomais Asvestopoulou, Giorgos Pantermakis, Nick Filippakis, H. Schulzrinne, Yana Kane-Esrig, M. Papadopouli
{"title":"Should I Stay or Should I Go","authors":"Maria Plakia, Evripides Tzamousis, Thomais Asvestopoulou, Giorgos Pantermakis, Nick Filippakis, H. Schulzrinne, Yana Kane-Esrig, M. Papadopouli","doi":"10.1145/3377873","DOIUrl":"https://doi.org/10.1145/3377873","url":null,"abstract":"To improve user engagement, especially under moderate to high traffic demand, it is important to understand the impact of the network and application QoS on user experience. This article comparatively evaluates the impact of impairments, with emphasis on rebufferings, startup delay, and bitrate changes, and their intensity and temporal dynamics, on user engagement in the context of video streaming. The analysis employed two large YouTube datasets. To characterize the user engagement and the impact of impairments, several new metrics were defined. We assessed whether or not there is a statistically significant relationship between different types of impairments and user engagement metrics, taking into account not only the characteristics of the impairments but also the covariates of the session (e.g., video duration, mean data rate). After observing the relationships across the entire dataset, we tested whether these relationships also persist under specific conditions with respect to the covariates. The introduction of several new metrics and of various covariates in the analysis are two innovative aspects of this work. We found that the presence of negative bitrate changes (BR-) is a stronger predictor of abandonment than rebufferings (RB). Positive bitrate changes (BR+) in low-resolution sessions are not well received. A high rebuffering ratio has a prominent impact on the video watching percentage. 
These results can be used to guide the video streaming adaptation as well as suggest which parameters should be varied in controlled field studies.","PeriodicalId":105474,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125580070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Some Parameterized Dynamic Priority Policies for Two-Class M/G/1 Queues","authors":"Manu K. Gupta, N. Hemachandra, J. Venkateswaran","doi":"10.1145/3384390","DOIUrl":"https://doi.org/10.1145/3384390","url":null,"abstract":"Completeness of a dynamic priority scheduling scheme is of fundamental importance for the optimal control of queues in areas as diverse as computer communications, communication networks, supply/value chains, and manufacturing systems. Our first main contribution is to identify the mean waiting time completeness as a unifying aspect for four different dynamic priority scheduling schemes by proving their completeness and equivalence in two-class M/G/1 queues. These dynamic priority schemes are earliest due date based, head of line priority jump, relative priority, and probabilistic priority. We discuss major challenges in extending our results to three or more classes. In our second main contribution, we characterize the optimal scheduling policies for the case studies in different domains by exploiting the completeness of the above dynamic priority schemes. The major theme of the second main contribution is resource allocation/optimal control in revenue management problems for contemporary systems such as cloud computing, high performance computing, and so forth, where congestion is inherent. Using completeness and the theoretically tractable nature of relative priority policy, we study the impact of approximation in a fairly generic data network utility framework. 
Next, we simplify a complex joint pricing and scheduling problem for a wider class of scheduling policies.","PeriodicalId":105474,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127577725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
L. Vassio, M. Garetto, C. Chiasserini, Emilio Leonardi
{"title":"User Interaction with Online Advertisements","authors":"L. Vassio, M. Garetto, C. Chiasserini, Emilio Leonardi","doi":"10.1145/3377144","DOIUrl":"https://doi.org/10.1145/3377144","url":null,"abstract":"We consider an online advertisement system and focus on the impact of user interaction and response to targeted advertising campaigns. We analytically model the system dynamics accounting for the user behavior and devise strategies to maximize a relevant metric called click-through-intensity (CTI), defined as the number of clicks per time unit. With respect to the traditional click-through-rate (CTR) metric, CTI better captures the success of advertisements for services that the users may access several times, making multiple purchases or subscriptions. Examples include advertising of on-line games or airplane tickets. The model we develop is validated through traces of real advertising systems and allows us to optimize CTI under different scenarios depending on the nature of ad delivery and of the information available to the system. Experimental results show that our approach can increase the revenue of an ad campaign, even when user behavior can only be estimated.","PeriodicalId":105474,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117080908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"iModel","authors":"M. Awad, D. Menascé","doi":"10.1145/3374220","DOIUrl":"https://doi.org/10.1145/3374220","url":null,"abstract":"Deriving analytic performance models requires detailed knowledge of the architecture and behavior of the computer system being modeled as well as modeling skills. This detailed knowledge may not be readily available (or it may be impractical to gather) given the dynamic nature of production computing environments. This article presents a framework, called iModel, for automatically deriving and parameterizing analytic performance models for multi-tiered computer systems. Analytic performance models consist of a workload model and a system model. iModel uses system logs and configuration files to generate a high-level characterization of the system; e.g., open queuing network (QN) model versus closed QN model. By harvesting more information from the system logs and configuration files, iModel generates a workload model by inferring user-system interaction patterns in the form of a Customer Behavior Model Graph (CBMG) and generates a system model by discovering system components and their interaction patterns in the form of a Client-Server Interaction Diagram (CSID). iModel includes a library of well-known single-queue and QN models and their solutions stored in an XML-based repository. The generated workload model and system model are compared to the model repository to determine which model in the repository best matches the system’s observable behavior and architecture. This article also presents a black-box optimization approach that is used to derive analytic model parameters by observing the input-output relationships of a real system. This optimization approach can be used in any computer system (multi-tier or not) that can be modeled by single queues or QNs. 
The important question is whether the automatically generated and parameterized performance model has predictive power, i.e., can the derived model predict the output values that would be observed in the real system for different values of the input? The results presented in this article demonstrate that the analytic performance models derived by iModel are relatively robust and have predictive power over a wide range of input values.","PeriodicalId":105474,"journal":{"name":"ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125539723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}