{"title":"PEMU: A Pin Highly Compatible Out-of-VM Dynamic Binary Instrumentation Framework","authors":"Junyuan Zeng, Yangchun Fu, Zhiqiang Lin","doi":"10.1145/2731186.2731201","DOIUrl":"https://doi.org/10.1145/2731186.2731201","url":null,"abstract":"Over the past 20 years, we have witnessed a widespread adoption of dynamic binary instrumentation (DBI) for numerous program analyses and security applications including program debugging, profiling, reverse engineering, and malware analysis. To date, there are many DBI platforms, and the most popular one is Pin, which provides various instrumentation APIs for process instrumentation. However, Pin does not support the instrumentation of OS kernels. In addition, the execution of the instrumentation and analysis routine is always inside the virtual machine (VM). Consequently, it cannot support any out-of-VM introspection that requires strong isolation. Therefore, this paper presents PEMU, a new open source DBI framework that is compatible with Pin-APIs, but supports out-of-VM introspection for both user level processes and OS kernels. Unlike in-VM instrumentation in which there is no semantic gap, for out-of-VM introspection we have to bridge the semantic gap and provide abstractions (i.e., APIs) for programmers. One important feature of PEMU is its API compatibility with Pin. As such, many Pin plugins are able to execute atop PEMU without any source code modification. We have implemented PEMU, and our experimental results with the SPEC 2006 benchmarks show that PEMU introduces reasonable overhead.","PeriodicalId":186972,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125219836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cheng-Chun Tu, M. Ferdman, Chao-Tang Lee, T. Chiueh
{"title":"A Comprehensive Implementation and Evaluation of Direct Interrupt Delivery","authors":"Cheng-Chun Tu, M. Ferdman, Chao-Tang Lee, T. Chiueh","doi":"10.1145/2731186.2731189","DOIUrl":"https://doi.org/10.1145/2731186.2731189","url":null,"abstract":"As the performance overhead associated with CPU and memory virtualization becomes largely negligible, research efforts are directed toward reducing the I/O virtualization overhead, which mainly comes from two sources: DMA set-up and payload copy, and interrupt delivery. The advent of SRIOV and MRIOV effectively reduces the DMA-related virtualization overhead to a minimum. Therefore, the last battleground for minimizing virtualization overhead is how to directly deliver every interrupt to its target VM without involving the hypervisor. This paper describes the design, implementation, and evaluation of a KVM-based direct interrupt delivery system called DID. DID delivers interrupts from SRIOV devices, virtual devices, and timers to their target VMs directly, completely avoiding VM exits. Moreover, DID does not require any modifications to the VM's operating system and preserves the correct priority among interrupts in all cases. We demonstrate that DID reduces the number of VM exits by a factor of 100 for I/O-intensive workloads, decreases the interrupt invocation latency by 80%, and improves the throughput of a VM running Memcached by a factor of 3.","PeriodicalId":186972,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125572785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stephen C. Kyle, Hugh Leather, Björn Franke, D. Butcher, Stuart Monteith
{"title":"Application of Domain-aware Binary Fuzzing to Aid Android Virtual Machine Testing","authors":"Stephen C. Kyle, Hugh Leather, Björn Franke, D. Butcher, Stuart Monteith","doi":"10.1145/2731186.2731198","DOIUrl":"https://doi.org/10.1145/2731186.2731198","url":null,"abstract":"The development of a new application virtual machine (VM), like the creation of any complex piece of software, is a bug-prone process. In version 5.0, the widely-used Android operating system has changed from the Dalvik VM to the newly-developed ART VM to execute Android applications. As new iterations of this VM are released, how can the developers aim to reduce the number of potentially security-threatening bugs that make it into the final product? In this paper we combine domain-aware binary fuzzing and differential testing to produce DexFuzz, a tool that exploits the presence of multiple modes of execution within a VM to test for defects. These modes of execution include the interpreter and a runtime that executes ahead-of-time compiled code. We find and present a number of bugs in the in-development version of ART in the Android Open Source Project. We also assess DexFuzz's ability to highlight defects in the experimental version of ART released in the previous version of Android, 4.4, finding 189 crashing programs and 15 divergent programs that indicate defects after only 5,000 attempts.","PeriodicalId":186972,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129237271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fei Guo, Seongbeom Kim, Y. Baskakov, Ishan Banerjee
{"title":"Proactively Breaking Large Pages to Improve Memory Overcommitment Performance in VMware ESXi","authors":"Fei Guo, Seongbeom Kim, Y. Baskakov, Ishan Banerjee","doi":"10.1145/2731186.2731187","DOIUrl":"https://doi.org/10.1145/2731186.2731187","url":null,"abstract":"VMware ESXi leverages hardware support for MMU virtualization available in modern Intel/AMD CPUs. To optimize address translation performance when running on such CPUs, ESXi preferably uses host large pages (2MB in x86-64 systems) to back VM's guest memory. While using host large pages provides best performance when host has sufficient free memory, it increases host memory pressure and effectively defeats page sharing. Hence, the host is more likely to hit the point where ESXi has to reclaim VM memory through much more expensive techniques such as ballooning or host swapping. As a result, using host large pages may significantly hurt consolidation ratio. To deal with this problem, we propose a new host large page management policy that allows to: a) identify 'cold' large pages and break them even when host has plenty of free memory; b) break all large pages proactively when host free memory becomes scarce, but before the host starts ballooning or swapping; c) reclaim the small pages within the broken large pages through page sharing. With the new policy, the shareable small pages can be shared much earlier and the amount of memory that needs to be ballooned or swapped can be largely reduced when host memory pressure is high. We also propose an algorithm to dynamically adjust the page sharing rate when proactively breaking large pages using a VM large page shareability estimator for higher efficiency. Experimental results show that the proposed large page management policy can improve the performance of various workloads up to 2.1x by significantly reducing the amount of ballooned or swapped memory when host memory pressure is high. Applications still fully benefit from host large pages when memory pressure is low.","PeriodicalId":186972,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127901686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards VM Consolidation Using a Hierarchy of Idle States","authors":"R. Singh, Tim Brecht, S. Keshav","doi":"10.1145/2731186.2731195","DOIUrl":"https://doi.org/10.1145/2731186.2731195","url":null,"abstract":"Typical VM consolidation approaches re-pack VMs into fewer physical machines, resulting in energy and cost savings [13, 19, 23, 40]. Recent work has explored a just-in time approach to VM consolidation by transitioning VMsto an inactive state when idle and activating them on the arrival of client requests[17, 21]. This leads to increased VM density at the cost of an increase in client request latency (called miss penalty). The VM density so obtained, although greater, is still limited by the number of VMs that can be hosted in the one inactive state. If idle VMs were hosted in multiple inactive states, VM density can be increased further while ensuring small miss penalties. However, VMs in different inactive states have different capacities, activation times, and resource requirements. Therefore, a key question is: How should VMs be transitioned between different states to minimize the expected miss penalty? This paper explores the hosting of idle VMs in a hierarchy of multiple such inactive states, and studies the effect of different idle VMmanagement policies on VMdensity and miss penalties. We formulate a mathematical model for the problem, and provide a theoretical lower bound on the miss penalty. Using an off-the-shelf virtualization solution (LXC [2]), we demonstrate how the required model parameters can be obtained. We evaluate a variety of policies and quantify their miss penalties for different VM densities. We observe that some policies consolidate up to 550 VMs per machine with average miss penalties smaller than 1 ms.","PeriodicalId":186972,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131166412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jonas Pfefferle, Patrick Stuedi, A. Trivedi, B. Metzler, Ioannis Koltsidas, T. Gross
{"title":"A Hybrid I/O Virtualization Framework for RDMA-capable Network Interfaces","authors":"Jonas Pfefferle, Patrick Stuedi, A. Trivedi, B. Metzler, Ioannis Koltsidas, T. Gross","doi":"10.1145/2731186.2731200","DOIUrl":"https://doi.org/10.1145/2731186.2731200","url":null,"abstract":"DMA-capable interconnects, providing ultra-low latency and high bandwidth, are increasingly being used in the context of distributed storage and data processing systems. However, the deployment of such systems in virtualized data centers is currently inhibited by the lack of a flexible and high-performance virtualization solution for RDMA network interfaces. In this work, we present a hybrid virtualization architecture which builds upon the concept of separation of paths for control and data operations available in RDMA. With hybrid virtualization, RDMA control operations are virtualized using hypervisor involvement, while data operations are set up to bypass the hypervisor completely. We describe HyV (Hybrid Virtualization), a virtualization framework for RDMA devices implementing such a hybrid architecture. In the paper, we provide a detailed evaluation of HyV for different RDMA technologies and operations. We further demonstrate the advantages of HyV in the context of a real distributed system by running RAMCloud on a set of HyV-enabled virtual machines deployed across a 6-node RDMA cluster. All of the performance results we obtained illustrate that hybrid virtualization enables bare-metal RDMA performance inside virtual machines while retaining the flexibility typically associated with paravirtualization.","PeriodicalId":186972,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments","volume":"207 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132767022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Remote Desktopping Through Adaptive Record/Replay","authors":"Shehbaz Jaffer, Piyus Kedia, Sorav Bansal","doi":"10.1145/2731186.2731193","DOIUrl":"https://doi.org/10.1145/2731186.2731193","url":null,"abstract":"Accessing the display of a computer remotely, is popularly called remote desktopping. Remote desktopping software installs at both the user-facing client computer and the remote server computer; it simulates user's input events at server, and streams the corresponding display changes to client, thus providing an illusion to the user of controlling the remote machine using local input devices (e.g., keyboard/mouse). Many such remote desktopping tools are widely used. We show that if the remote server is a virtual machine (VM) and the client is reasonably powerful (e.g., current laptop and desktop grade hardware), VM deterministic replay capabilities can be used adaptively to significantly reduce the network bandwidth consumption and server-side CPU utilization of a remote desktopping tool. We implement these optimizations in a tool based on Qemu/KVM virtualization platform and VNC remote desktopping platform. Our tool reduces VNC's network bandwidth consumption by up to 9x and server-side CPU utilization by up to 56% for popular graphics-intensive applications. On the flip side, our techniques consume higher CPU/memory/disk resources at the client. The effect of our optimizations on user-perceived latency is negligible.","PeriodicalId":186972,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134314170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lei Cui, Tianyu Wo, B. Li, Jianxin Li, Bin Shi, J. Huai
{"title":"PARS: A Page-Aware Replication System for Efficiently Storing Virtual Machine Snapshots","authors":"Lei Cui, Tianyu Wo, B. Li, Jianxin Li, Bin Shi, J. Huai","doi":"10.1145/2731186.2731190","DOIUrl":"https://doi.org/10.1145/2731186.2731190","url":null,"abstract":"Virtual machine (VM) snapshot enhances the system availability by saving the running state into stable storage during failure-free execution and rolling back to the snapshot point upon failures. Unfortunately, the snapshot state may be lost due to disk failures, so that the VM fails to be recovered. The popular distributed file systems employ replication technique to tolerate disk failures by placing redundant copies across disperse disks. However, unless user-specific personalization is provided, these systems consider the data in the file as of same importance and create identical copies of the entire file, leading to non-trivial additional storage overhead. This paper proposes a page-aware replication system (PARS) to store VM snapshots efficiently. PARS employs VM introspection technique to explore how a page is used by guest, and classifies the pages by their importance to system execution. If a page is critical, PARS replicates it multiple copies to ensure high availability and long-term durability. Otherwise, the loss of this page causes no harm for system to work properly, PARS therefore saves only one copy of the page. Consequently, PARS improves storage efficiency without compromising availability. We have implemented PARS to justify its practicality. The experimental results demonstrate that PARS achieves 53.9% space saving compared to the native replication approach in HDFS which replicates the whole snapshot file fully and identically.","PeriodicalId":186972,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132976116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhe Wang, Jianjun Li, Chenggang Wu, Dongyan Yang, Zhenjiang Wang, W. Hsu, Bin Li, Yong Guan
{"title":"HSPT: Practical Implementation and Efficient Management of Embedded Shadow Page Tables for Cross-ISA System Virtual Machines","authors":"Zhe Wang, Jianjun Li, Chenggang Wu, Dongyan Yang, Zhenjiang Wang, W. Hsu, Bin Li, Yong Guan","doi":"10.1145/2731186.2731188","DOIUrl":"https://doi.org/10.1145/2731186.2731188","url":null,"abstract":"Cross-ISA (Instruction Set Architecture) system-level virtual machine has a significant research and practical value. For example, several recently announced virtual smart phones for iOS which run smart phone applications on x86 based PCs are deployed on cross-ISA system level virtual machines. Also, for mobile device application development, by emulating the Android/ARM environment on the more powerful x86-64 platform, application development and debugging become more convenient and productive. However, the virtualization layer often incurs high performance overhead. The key overhead comes from memory virtualization where a guest virtual address (GVA) must go through multi-level address translation to become a host physical address (HPA). The Embedded Shadow Page Table (ESPT) approach has been proposed to effectively decrease this address translation cost. ESPT directly maps GVA to HPA, thus avoid the lengthy guest virtual to guest physical, guest physical to host virtual, and host virtual to host physical address translation. However, the original ESPT work has a few drawbacks. For example, its implementation relies on a loadable kernel module (LKM) to manage the shadow page table. Using LKMs is less desirable for system virtual machines due to portability, security and maintainability concerns. Our work proposes a different, yet more practical, implementation to address the shortcomings. Instead of relying on using LKMs, our approach adopts a shared memory mapping scheme to maintain the shadow page table (SPT) using only ''mmap'' system call. Furthermore, this work studies the support of SPT for multi-processing in greater details. It devices three different SPT organizations and evaluates their strength and weakness with standard and real Android applications on the system virtual machine which emulates the Android/ARM platform on x86-64 systems.","PeriodicalId":186972,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127772273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
JinSeok Oh, Jin-woo Kwon, Hyukwoo Park, Soo-Mook Moon
{"title":"Migration of Web Applications with Seamless Execution","authors":"JinSeok Oh, Jin-woo Kwon, Hyukwoo Park, Soo-Mook Moon","doi":"10.1145/2731186.2731197","DOIUrl":"https://doi.org/10.1145/2731186.2731197","url":null,"abstract":"Web applications (apps) are programmed using HTML5, CSS, and JavaScript, and are distributed in the source code format. Web apps can be executed on any devices where a web browser is installed, allowing one-source, multi-platform environment. We can exploit this advantage of platform independence for a new user experience called app migration, which allows migrating an app in the middle of execution seamlessly between smart devices. This paper proposes such a migration framework for web apps where we can save the current state of a running app and resume its execution on a different device by restoring the saved state. We save the web app's state in the form of a snapshot, which is actually another web app whose execution can restore the saved state. In the snapshot, the state of the JavaScript variables and DOM trees are saved using the JSON format. We solved some of the saving/restoring problems related to event handlers and closures by accessing the browser and the JavaScript engine internals. Our framework does not require instrumenting an app or changing its source code, but works for the original app. We implemented the framework on the Chrome browser with the V8 JavaScript engine and successfully migrated non-trivial sample apps with reasonable saving and restoring overhead. We also discuss other usage of the snapshot for optimizations and user experiences for the web platform.","PeriodicalId":186972,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122710660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}