{"title":"SPIRE: improving dynamic binary translation through SPC-indexed indirect branch redirecting","authors":"Ning Jia, Chun Yang, Jing Wang, Dong Tong, Keyi Wang","doi":"10.1145/2451512.2451516","DOIUrl":"https://doi.org/10.1145/2451512.2451516","url":null,"abstract":"Dynamic binary translation system must perform an address translation for every execution of indirect branch instructions. The procedure to convert Source binary Program Counter (SPC) address to Translated Program Counter (TPC) address always takes more than 10 instructions, becoming a major source of performance overhead. This paper proposes a novel mechanism called SPc-Indexed REdirecting (SPIRE), which can significantly reduce the indirect branch handling overhead. SPIRE doesn't rely on hash lookup and address mapping table to perform address translation. It reuses the source binary code space to build a SPC-indexed redirecting table. This table can be indexed directly by SPC address without hashing. With SPIRE, the indirect branch can jump to the originally SPC address without address translation. The trampoline residing in the SPC address will redirect the control flow to related code cache. Only 2-6 instructions are needed to handle an indirect branch execution. As part of the source binary would be overwritten, a shadow page mechanism is explored to keep transparency of the corrupt source binary code page. Online profiling is adopted to reduce the memory overhead. We have implemented SPIRE on an x86 to x86 DBT system, and discussed the implementation issues on different guest and host architectures. 
The experiments show that, compared with hash lookup mechanism, SPIRE can reduce the performance overhead by 36.2% on average, up to 51.4%, while only 5.6% extra memory is needed. SPIRE can cooperate with other indirect branch handling mechanisms easily, and we believe the idea of SPIRE can also be applied on other occasions that need address translation.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117232099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Limits of region-based dynamic binary parallelization","authors":"T. Koch, Björn Franke","doi":"10.1145/2451512.2451518","DOIUrl":"https://doi.org/10.1145/2451512.2451518","url":null,"abstract":"Efficiently executing sequential legacy binaries on chip multi-processors (CMPs) composed of many, small cores is one of today's most pressing problems. Single-threaded execution is a suboptimal option due to CMPs' lower single-core performance, while multi-threaded execution relies on prior parallelization, which is severely hampered by the low-level binary representation of applications compiled and optimized for a single-core target. A recent technology to address this problem is Dynamic Binary Parallelization (DBP), which creates a Virtual Execution Environment (VEE) taking advantage of the underlying multicore host to transparently parallelize the sequential binary executable. While still in its infancy, DBP has received broad interest within the research community. The combined use of DBP and thread-level speculation (TLS) has been proposed as a technique to accelerate legacy uniprocessor code on modern CMPs. In this paper, we investigate the limits of DBP and seek to gain an understanding of the factors contributing to these limits and the costs and overheads of its implementation. We have performed an extensive evaluation using a parameterizable DBP system targeting a CMP with light-weight architectural TLS support. We demonstrate that there is room for a significant reduction of up to 54% in the number of instructions on the critical paths of legacy SPEC CPU2006 benchmarks. 
However, we show that it is much harder to translate these savings into actual performance improvements, with a realistic hardware-supported implementation achieving a speedup of 1.09 on average.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":" 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120935660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallelizing live migration of virtual machines","authors":"Xiang Song, Jicheng Shi, Ran Liu, Jian Yang, Haibo Chen","doi":"10.1145/2451512.2451531","DOIUrl":"https://doi.org/10.1145/2451512.2451531","url":null,"abstract":"Live VM migration is one of the major primitive operations to manage virtualized cloud platforms. Such operation is usually mission-critical and disruptive to the running services, and thus should be completed as fast as possible. Unfortunately, with the increasing amount of resources configured to a VM, such operations are becoming increasingly time-consuming. In this paper, we make a comprehensive analysis on the parallelization opportunities of live VM migration on two popular open-source VMMs (i.e., Xen and KVM). By leveraging abundant resources like CPU cores and NICs in contemporary server platforms, we design and implement a system called PMigrate that leverages data parallelism and pipeline parallelism to parallelize the operation. As the parallelization framework requires intensive mmap/munmap operations that tax the address space management system in an operating system, we further propose an abstraction called range lock, which improves scalability of concurrent mutation to the address space of an operating system (i.e., Linux) by selectively replacing the per-process address space lock inside kernel with dynamic and fine-grained range locks that exclude costly operations on the requesting address range from using the per-process lock. Evaluation with our working prototype on Xen and KVM shows that PMigrate accelerates the live VM migration ranging from 2.49X to 9.88X, and decreases the downtime ranging from 1.9X to 279.89X. Performance analysis shows that our integration of range lock to Linux significantly improves parallelism in mutating the address space in VM migration and thus boosts the performance ranging from 2.06X to 3.05X. 
We also show that PMigrate makes only small disruption to other co-hosted production VMs.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134449592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards verifiable resource accounting for outsourced computation","authors":"Chen Chen, Petros Maniatis, A. Perrig, Amit Vasudevan, V. Sekar","doi":"10.1145/2451512.2451546","DOIUrl":"https://doi.org/10.1145/2451512.2451546","url":null,"abstract":"Outsourced computation services should ideally only charge customers for the resources used by their applications. Unfortunately, no verifiable basis for service providers and customers to reconcile resource accounting exists today. This leads to undesirable outcomes for both providers and consumers: providers cannot prove to customers that they really devoted the resources charged, and customers cannot verify that their invoice maps to their actual usage. As a result, many practical and theoretical attacks exist, aimed at charging customers for resources that their applications did not consume. Moreover, providers cannot charge consumers precisely, which causes them to bear the cost of unaccounted resources or pass these costs inefficiently to their customers. We introduce ALIBI, a first step toward a vision for verifiable resource accounting. ALIBI places a minimal, trusted reference monitor underneath the service provider's software platform. This monitor observes resource allocation to customers' guest virtual machines and reports those observations to customers, for verifiable reconciliation. 
In this paper, we show that ALIBI efficiently and verifiably tracks guests' memory use and CPU-cycle consumption.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130767900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EXTERIOR: using a dual-VM based external shell for guest-OS introspection, configuration, and recovery","authors":"Yangchun Fu, Zhiqiang Lin","doi":"10.1145/2451512.2451534","DOIUrl":"https://doi.org/10.1145/2451512.2451534","url":null,"abstract":"This paper presents EXTERIOR, a dual-VM architecture based external shell that can be used for trusted, timely out-of-VM management of guest-OS such as introspection, configuration, and recovery. Inspired by recent advances in virtual machine introspection (VMI), EXTERIOR leverages an isolated, secure virtual machine (SVM) to introspect the kernel state of a guest virtual machine (GVM). However, it goes far beyond the read-only capability of the traditional VMI, and can perform automatic, fine-grained guest-OS writable operations. The key idea of EXTERIOR is to use a dual-VM architecture in which a SVM runs a kernel identical to that of the GVM to create the necessary environment for a running process (e.g., rmmod, kill), and dynamically and transparently redirect and update the memory state at the VMM layer from SVM to GVM, thereby achieving the same effect in terms of kernel state updates of running the same trusted in-VM program inside the shell of GVM. A proof-of-concept EXTERIOR has been implemented. 
The experimental results show that EXTERIOR can be used for a timely administration of guest-OS, including introspection and (re)configuration of the guest-OS state and timely response of kernel malware intrusions, without any user account in the guest-OS.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133000131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DDGacc: boosting dynamic DDG-based binary optimizations through specialized hardware support","authors":"Demos Pavlou, E. Gibert, Fernando Latorre, Antonio González","doi":"10.1145/2151024.2151046","DOIUrl":"https://doi.org/10.1145/2151024.2151046","url":null,"abstract":"Dynamic Binary Translators (DBT) and Dynamic Binary Optimization (DBO) by software are used widely for several reasons including performance, design simplification and virtualization. However, the software layer in such systems introduces non-negligible overheads which affect performance and user experience. Hence, reducing DBT/DBO overheads is of paramount importance. In addition, reduced overheads have interesting collateral effects in the rest of the software layer, such as allowing optimizations to be applied earlier. A cost-effective solution to this problem is to provide hardware support to speed up the primitives of the software layer, paying special attention to automate DBT/DBO mechanisms and leave the heuristics to the software, which is more flexible. In this work, we have characterized the overheads of a DBO system using DynamoRIO implementing several basic optimizations. We have seen that the computation of the Data Dependence Graph (DDG) accounts for 5%-10% of the execution time. For this reason, we propose to add hardware support for this task in the form of a new functional unit, called DDGacc, which is integrated in a conventional pipeline processor and is operated through new ISA instructions. 
Our evaluation shows that DDGacc reduces the cost of computing the DDG by 32x, which reduces overall execution time by 5%-10% on average and up to 18% for applications where the DBO optimizes large code footprints.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"307 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115438701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Replacement attacks against VM-protected applications","authors":"S. Ghosh, Jason Hiser, J. Davidson","doi":"10.1145/2151024.2151051","DOIUrl":"https://doi.org/10.1145/2151024.2151051","url":null,"abstract":"Process-level virtualization is increasingly being used to enhance the security of software applications from reverse engineering and unauthorized modification (called software protection). Process-level virtual machines (PVMs) can safeguard the application code at run time and hamper the adversary's ability to launch dynamic attacks on the application. This dynamic protection, combined with its flexibility, ease in handling legacy systems and low performance overhead, has made process-level virtualization a popular approach for providing software protection. While there has been much research on using process-level virtualization to provide such protection, there has been less research on attacks against PVM-protected software. In this paper, we describe an attack on applications protected using process-level virtualization, called a replacement attack. In a replacement attack, the adversary replaces the protecting PVM with an attack VM thereby rendering the application vulnerable to analysis and modification. We present a general description of the replacement attack methodology and two attack implementations against a protected application using freely available tools. 
The generality and simplicity of replacement attacks demonstrates that there is a strong need to develop techniques that meld applications more tightly to the protecting PVM to prevent such attacks.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114455291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unpicking the knot: teasing apart VM/application interdependencies","authors":"Yi Lin, S. Blackburn, Daniel Frampton","doi":"10.1145/2151024.2151048","DOIUrl":"https://doi.org/10.1145/2151024.2151048","url":null,"abstract":"Flexible and efficient runtime design requires an understanding of the dependencies among the components internal to the runtime and those between the application and the runtime. These dependencies are frequently unclear. This problem exists in all runtime design, and is most vivid in a metacircular runtime --- one that is implemented in terms of itself. Metacircularity blurs boundaries between application and runtime implementation, making it harder to understand and make guarantees about overall system behavior, affecting isolation, security, and resource management, as well as reducing opportunities for optimization. Our goal is to shed new light on VM interdependencies, helping all VM designers understand these dependencies and thereby engineer better runtimes. We explore these issues in the context of a high-performance Java-in-Java virtual machine. Our approach is to identify and instrument transition points into and within the runtime, which allows us to establish a dynamic execution context. Our contributions are: 1) implementing and measuring a system that dynamically maintains execution context with very low overhead, 2) demonstrating that such a framework can be used to improve the software engineering of an existing runtime, and 3) analyzing the behavior and runtime characteristics of our runtime across a wide range of benchmarks. 
Our solution provides clarity about execution state and allowable transitions, making it easier to develop, debug, and understand managed runtimes.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127376264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"REEact: a customizable virtual execution manager for multicore platforms","authors":"Wei Wang, Tanima Dey, Ryan W. Moore, Mahmut Aktasoglu, B. Childers, J. Davidson, M. J. Irwin, M. Kandemir, M. Soffa","doi":"10.1145/2151024.2151031","DOIUrl":"https://doi.org/10.1145/2151024.2151031","url":null,"abstract":"With the shift to many-core chip multiprocessors (CMPs), a critical issue is how to effectively coordinate and manage the execution of applications and hardware resources to overcome performance, power consumption, and reliability challenges stemming from hardware and application variations inherent in this new computing environment. Effective resource and application management on CMPs requires consideration of user/application/hardware-specific requirements and dynamic adaption of management decisions based on the actual run-time environment. However, designing an algorithm to manage resources and applications that can dynamically adapt based on the run-time environment is difficult because most resource and application management and monitoring facilities are only available at the operating system level. This paper presents REEact, an infrastructure that provides the capability to specify user-level management policies with dynamic adaptation. REEact is a virtual execution environment that provides a framework and core services to quickly enable the design of custom management policies for dynamically managing resources and applications. To demonstrate the capabilities and usefulness of REEact, this paper describes three case studies--each illustrating the use of REEact to apply a specific dynamic management policy on a real CMP. 
Through these case studies, we demonstrate that REEact can effectively and efficiently implement policies to dynamically manage resources and adapt application execution.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130947716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling virtualized applications using machine learning techniques","authors":"Sajib Kundu, R. Rangaswami, Ajay Gulati, Ming Zhao, K. Dutta","doi":"10.1145/2151024.2151028","DOIUrl":"https://doi.org/10.1145/2151024.2151028","url":null,"abstract":"With the growing adoption of virtualized datacenters and cloud hosting services, the allocation and sizing of resources such as CPU, memory, and I/O bandwidth for virtual machines (VMs) is becoming increasingly important. Accurate performance modeling of an application would help users in better VM sizing, thus reducing costs. It can also benefit cloud service providers who can offer a new charging model based on the VMs' performance instead of their configured sizes. In this paper, we present techniques to model the performance of a VM-hosted application as a function of the resources allocated to the VM and the resource contention it experiences. To address this multi-dimensional modeling problem, we propose and refine the use of two machine learning techniques: artificial neural network (ANN) and support vector machine (SVM). We evaluate these modeling techniques using five virtualized applications from the RUBiS and Filebench suite of benchmarks and demonstrate that their median and 90th percentile prediction errors are within 4.36% and 29.17% respectively. These results are substantially better than regression based approaches as well as direct applications of machine learning techniques without our refinements. 
We also present a simple and effective approach to VM sizing and empirically demonstrate that it can deliver optimal results for 65% of the sizing problems that we studied and produces close-to-optimal sizes for the remaining 35%.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122401044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}