{"title":"Reducing Journaling Harm on Virtualized I/O Systems","authors":"Eunji Lee, H. Bahn, Minseong Jeong, Sunghwan Kim, Jesung Yeon, S. Yoo, S. Noh, K. Shin","doi":"10.1145/2928275.2928289","DOIUrl":"https://doi.org/10.1145/2928275.2928289","url":null,"abstract":"This paper analyzes the host cache effectiveness in full virtualization, particularly associated with journaling of guests. We observe that the journal access of guests degrades cache performance largely due to the write-once access pattern and the frequent sync operations. To remedy this problem, we design and implement a novel caching policy, called PDC (Pollution Defensive Caching), that detects the journal accesses and prevents them from entering the host cache. The proposed PDC is implemented in QEMU-KVM 2.1 on Linux 3.14 and provides 3-32% performance improvement for various file and I/O benchmarks.","PeriodicalId":20607,"journal":{"name":"Proceedings of the 9th ACM International on Systems and Storage Conference","volume":"19 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81566727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Elastic Queue: A Universal SSD Lifetime Extension Plug-in for Cache Replacement Algorithms","authors":"Yushi Liang, Yunpeng Chai, Ning Bao, Hengyu Chen, Yao-Hong Liu","doi":"10.1145/2928275.2928286","DOIUrl":"https://doi.org/10.1145/2928275.2928286","url":null,"abstract":"Flash-based solid-state drives (SSDs) are increasingly deployed as the second-level cache in storage systems because of the noticeable performance acceleration and transparency for the original software. However, the frequent data updates of existing cache replacement algorithms (e.g. LRU, LIRS, and LARC) cause too many writes on SSDs, leading to short lifetime and high costs of devices. SSD-oriented cache schemes with fewer SSD writes have fixed strategies of selecting cache blocks, so we cannot freely choose a suitable cache algorithm to adapt to application features for higher performance. Therefore, a universal SSD lifetime extension plug-in called Elastic Queue (EQ), which can cooperate with any cache algorithm to extend the lifetime of SSDs, is proposed in this paper. EQ reduces the data updating frequency by extending the eviction border of cache blocks elastically, making SSD devices serve much longer. The experimental results based on some real-world traces indicate that for the original LRU, LIRS, and LARC schemes, adding the EQ plug-in reduces their SSD write amounts by a factor of 39.03, and improves the cache hit rates by 17.30% on average at the same time.","PeriodicalId":20607,"journal":{"name":"Proceedings of the 9th ACM International on Systems and Storage Conference","volume":"115 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78183006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enterprise Resource Management in Mesos Clusters","authors":"Abed Abu Dbai, David Breitgand, G. Gershinsky, A. Glikson, K. Ahmed","doi":"10.1145/2928275.2933272","DOIUrl":"https://doi.org/10.1145/2928275.2933272","url":null,"abstract":"Enterprise data centers increasingly adopt a cloud-like architecture that enables the execution of multiple workloads on a shared pool of resources, reduces the data center footprint and drives down the costs. A number of cluster resource managers have appeared over the last few years, aimed at providing a uniform technology-neutral resource representation and management substrate. Examples include Apache YARN, Google Borg and Omega, Apache Mesos, and IBM Platform EGO. The Apache Mesos project [2] is emerging as a leading open source resource management technology for server clusters. Mesos offers simple yet powerful and flexible APIs, highly available and fault tolerant architecture, scalability to large clusters, isolation between tasks using Linux containers, multi-dimensional resource scheduling, ability to allocate shares of the cluster to roles representing users or user groups, and a clear separation of concerns between the applications (termed frameworks) and the \"cluster kernel\", which is Mesos. The resource scheduler of Mesos supports a generalization of max-min fairness, termed Dominant Resource Fairness (DRF) [1] scheduling discipline, which harmonizes the execution of heterogeneous workloads (in terms of resource demand) by maximizing the share of any resource allocated to a specific framework. However, the default Mesos allocation mechanism lacks a number of policy and tenancy capabilities, important in enterprise deployments. We have investigated integration of Mesos with the IBM EGO (enterprise grid orchestrator) technology [3] which underpins various high performance computing, analytics and big data clusters in a variety of industry verticals including financial services, life sciences, manufacturing and electronics. We have designed and implemented an experimental integration prototype, and have tested it with SparkBench workloads. We demonstrate how Mesos can be enriched with new resource policy capabilities, required for managing enterprise data centers, such as • Capturing of the hierarchical structure of an enterprise (organisations, departments, groups, teams, users) by defining the corresponding resource consumer tree; • A fine-grained resource plan that defines resource share ratio, ownership and lending/borrowing policies for each resource consumer; • A rich set of resource management policies making use of the hierarchical resource consumer model and providing fairness and isolation to the members of hierarchy including an important ability to dynamically change the allocations (time-based policy); • A Web-based GUI providing a centralized console through which the whole cluster is observed and managed. In particular, the cluster-wide resource management policies are applied through this GUI.","PeriodicalId":20607,"journal":{"name":"Proceedings of the 9th ACM International on Systems and Storage Conference","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76439031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Robotic Mobile Hot Spot Relay (MHSR) for Disaster Areas","authors":"Itai Dabran, Tom Palny","doi":"10.1145/2928275.2933279","DOIUrl":"https://doi.org/10.1145/2928275.2933279","url":null,"abstract":"Rescue forces in disaster areas mostly use Mobile Ad-Hoc networks to enable quick communication facilities over various physical barriers. Such networks consist of mobile terminals connected to base stations (BS) or Access Points (AP) in order to transmit essential information to the outer world. In disaster areas, rescue forces are equipped with a PAN (Personal Area Network) which combines devices such as medical, vibration and noise sensors. In such areas where communication conditions are unstable, it is essential to deploy the infrastructure as soon as possible. For example, the authors of [2] propose an implementation of an Autonomous P2P Ad-Hoc Group Communication that supports the need of emergency communication in earthquake disaster areas. In [3] a model for developing Ad Hoc Network configuration technologies is proposed. This model, the Disaster Area Architecture, improves information exchange and coordination among the participants. We present a small self-propelled robot. Our robot is resistant to mechanical damage [4] and operates as a communication relay in order to overcome communication disorders inside ruins or tunnels, between the PAN and the outer world. This robot is called when there is no direct connection towards a wireless Access Point (AP), using a short range communication request (by a smartphone for example). Our Mobile Hot Spot Relay (MHSR) depicted in Figure 1, moves independently and can be used in a disaster area, where it can be mobilized upon a request. While moving, it monitors the Wi-Fi signal towards the AP and when it goes under a certain (predefined) threshold it stops.","PeriodicalId":20607,"journal":{"name":"Proceedings of the 9th ACM International on Systems and Storage Conference","volume":"29 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88262768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Utilizing Optical Circuits in Hybrid Packet/Circuit Data-Center Networks","authors":"Y. Ben-Itzhak, C. Caba, José Soler","doi":"10.1145/2928275.2933284","DOIUrl":"https://doi.org/10.1145/2928275.2933284","url":null,"abstract":"Existing Data Center Networks (DCNs) continue to evolve to keep up with application requirements in terms of bandwidth, latency, agility, etc. According to the updated release of the Cisco Global Cloud Index [1], by 2019, more than 86% of traffic workloads will be processed by cloud DCs. Traditional DCNs, which are based on electrical packet switching (EPS) with hierarchical, tree-like topologies can no longer support future cloud traffic requirements in terms of dynamicity, bandwidth and latency. Hence, existing DCNs can be enhanced with OCS (Optical Circuit Switching), which provides high bandwidth, low latency and low power consumption [2], giving rise to hybrid OCS-EPS topologies. In this research, we assess a virtualized, hybrid, flat DCN topology consisting of a single layer of high radix ToR (Top of Rack) switches, interconnected with each other and through an OCS plane. The benefit of such flat topology is twofold: 1) In terms of bandwidth, over-subscription is reduced, and bisection bandwidth is increased; and 2) In terms of latency, the diameter (longest path) of topology is reduced. Moreover, we present new algorithms and orchestration functionality to detect and offload suitable flows (e.g. elephant flows) from the EPS to the OCS plane. Our DC architecture consists of hybrid EPS-OCS DCN, an OpenFlow (OF) based control plane, and an orchestration layer. Our orchestration layer decouples the elephant flows detection from the rerouting decision logic in the DCN. Specifically, the elephant flows detection is done by flow tagging in the hypervisor, while the flow rerouting is executed at the EPSs, which are connected directly to the OCS. Hence, it provides a more efficient, scalable, and easy to configure architecture as compared to existing hybrid solutions. The orchestrator monitors the ToR switches by sFlow and detects high volume traffic between two ToRs, exceeding a given bandwidth threshold. Such traffic may consist of either few elephant flows or many mice flows. To further increase the optical circuit utilization, we introduce two types of optical circuits: 1) private circuit, presented in existing solutions, is utilized only by flows that originate and end at the ToR switches connected to the circuit endpoints. 2) shared circuit, is part of our novel approach. It can be used also by flows that are transmitted through ToR switches connected to the circuit endpoints, but originate and/or end at other ToRs. Moreover, the orchestrator may dynamically decide to configure private or shared optical circuits, according to various criteria including current network utilization, traffic flows nature, tenants SLAs, etc. Configuring or changing the optical circuit type requires installing a single OpenFlow rule for each ToR connected to the circuit endpoints; hence, enabling low overhead and fast network configuration. To assess the benefit of such optical circuit configurations, we implement the proposed algorithms and test them over an emula","PeriodicalId":20607,"journal":{"name":"Proceedings of the 9th ACM International on Systems and Storage Conference","volume":"52 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74182214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SSD Failures in Datacenters: What? When? and Why?","authors":"Iyswarya Narayanan, Di Wang, Myeongjae Jeon, Bikash Sharma, Laura Caulfield, A. Sivasubramaniam, Ben Cutler, Jie Liu, Badriddine M. Khessib, Kushagra Vaid","doi":"10.1145/2928275.2928278","DOIUrl":"https://doi.org/10.1145/2928275.2928278","url":null,"abstract":"Despite the growing popularity of Solid State Disks (SSDs) in the datacenter, little is known about their reliability characteristics in the field. The little knowledge is mainly vendor supplied, and such information cannot really help understand how SSD failures can manifest and impact the operation of production systems, in order to take appropriate remedial measures. Besides actual failure data and the symptoms exhibited by SSDs before failing, a detailed characterization effort requires a wide set of data about factors influencing SSD failures, right from provisioning factors to the operational ones. This paper presents an extensive SSD failure characterization by analyzing a wide spectrum of data from over half a million SSDs that span multiple generations spread across several datacenters which host a wide spectrum of workloads over nearly 3 years. By studying the diverse set of design, provisioning and operational factors on failures, and their symptoms, our work provides the first comprehensive analysis of the what, when and why characteristics of SSD failures in production datacenters.","PeriodicalId":20607,"journal":{"name":"Proceedings of the 9th ACM International on Systems and Storage Conference","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87515059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proper Timed I/O: High-Accuracy Real-Time Control for Conventional Operating Systems","authors":"Yogev Vaknin, Sivan Toledo","doi":"10.1145/2928275.2928283","DOIUrl":"https://doi.org/10.1145/2928275.2928283","url":null,"abstract":"We propose a novel high-level abstraction for real-time control, called Proper Timed I/O (PTIO). The abstraction allows user-space programs running on a stock operating system (without real-time extensions) to perform high-resolution real-time digital I/O (setting pins high or low, responding to input transitions, etc.). PTIO programs express their real-time I/O behavior in terms of a timed automaton that can communicate with the user-space program. Simple behaviors are encoded in the timed automaton; complex behaviors are implemented by the user-space program. We present two implementations of the PTIO abstraction, both for Linux. One utilizes a deterministic co-processor that is available on some ARM-based system-on-a-chip processors. This implementation can achieve timing accuracy of 100ns or better and can perform millions of finite-state transitions per second. The other implementation uses hardware timers that are available on every system-on-a-chip; it achieves a timing accuracy of 6µs or better, but it is limited to about 2000 state transitions per second. Both implementations guarantee that the PTIO never fails silently: if the mechanism missed a deadline, the user space program is always notified. In many cases, PTIOs eliminate the need for bare-metal programming or for specialized real-time operating systems.","PeriodicalId":20607,"journal":{"name":"Proceedings of the 9th ACM International on Systems and Storage Conference","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78673145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"File System Usage in Android Mobile Phones","authors":"R. Friedman, David Sainz","doi":"10.1145/2928275.2928280","DOIUrl":"https://doi.org/10.1145/2928275.2928280","url":null,"abstract":"In this paper, we report on the analysis of data from Android mobile phones of 38 users, composed of access traces of the users' mobile file systems during 30 days. We shed new light on the file usage patterns and present the data in terms of file size distributions, file sessions, file lifetime, file access activity and read / write access patterns. We characterize different distributions and extract conclusions about usage patterns of Android file systems.","PeriodicalId":20607,"journal":{"name":"Proceedings of the 9th ACM International on Systems and Storage Conference","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78847835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SeMiNAS: A Secure Middleware for Wide-Area Network-Attached Storage","authors":"Ming Chen, E. Zadok, A. Vasudevan, Kelong Wang","doi":"10.1145/2928275.2928282","DOIUrl":"https://doi.org/10.1145/2928275.2928282","url":null,"abstract":"Utility computing is being gradually realized as exemplified by cloud computing. Outsourcing computing and storage to global-scale cloud providers benefits from high accessibility, flexibility, scalability, and cost-effectiveness. However, users are uneasy outsourcing the storage of sensitive data due to security concerns. We address this problem by presenting SeMiNAS---an efficient middleware system that allows files to be securely outsourced to providers and shared among geo-distributed offices. SeMiNAS achieves end-to-end data integrity and confidentiality with a highly efficient authenticated-encryption scheme. SeMiNAS leverages advanced NFSv4 features, including compound procedures and data-integrity extensions, to minimize extra network round trips caused by security meta-data. SeMiNAS also caches remote files locally to reduce accesses to providers over WANs. We designed, implemented, and evaluated SeMiNAS, which demonstrates a small performance penalty of less than 26% and an occasional performance boost of up to 19% for Filebench workloads.","PeriodicalId":20607,"journal":{"name":"Proceedings of the 9th ACM International on Systems and Storage Conference","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85508370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-ISA Container Migration","authors":"J. Nider, Mike Rapoport","doi":"10.1145/2928275.2933275","DOIUrl":"https://doi.org/10.1145/2928275.2933275","url":null,"abstract":"Containers are a convenient way of encapsulating and isolating applications. They incur less overhead than virtual machines and provide more flexibility and versatility to improve server utilization. Many new cloud applications are being written in the microservices style to take advantage of container technologies. Each component of the application can be encapsulated in a separate container, which enables the use of other features such as auto-scaling. However, legacy applications can also benefit from containers which provide more efficient development and deployment models. In modern data centers, orchestration middleware is responsible for container placement, SLA enforcement and resource management. The orchestration software can implement various policies for managing the resources. The orchestration software can take corrective actions when detecting inefficiencies in the data center operation to satisfy the current policy. Power efficiency is becoming one of the most important characteristics taken into account when designing a data center and defining policy for the orchestration middleware [4]. Different server architectures have different power efficiency and energy proportionality characteristics. Recent research has shown that heterogeneous systems have the potential to significantly improve energy efficiency [3, 5]. Our work focuses on the mechanism required by the middleware to implement a power optimization policy. We research migration of containerized applications between servers inside a heterogeneous data center, for the purpose of optimizing power efficiency. Migrating a running container between different architectures relies on the compatibility of the application environment on the source and destination servers. Containers are viewed as a set of one or more processes and each process must have the ability to be migrated. A modified compiler is used to build executables in a manner allowing the program migration between different architectures. The source and destination servers must also have a shared file system and comparable networking capabilities. We take advantage of the recently added user-space page fault feature in the Linux kernel [2] to implement post-copy container migration in CRIU [1]. Post-copy migration significantly reduces perceived down-time of the container, and can potentially reduce network traffic as well. We propose creating a cluster of servers with different architectures (i.e., ARM, POWER, and x86) connected with a high-speed, low-latency network. This cluster will run SaaS applications in a containerized environment. The applications will be built using a specialized toolchain that ensures an identical memory layout across all architectures, enabling seamless migration at runtime. The majority of the challenges in cross-ISA migration are related to the toolchain adaptation, and ensuring the compatibility of the runtime environment across various servers in the cluster. The ability to efficien","PeriodicalId":20607,"journal":{"name":"Proceedings of the 9th ACM International on Systems and Storage Conference","volume":"36 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89212721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}