{"title":"Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms","authors":"K. Chang, A. G. Yaglikçi, Saugata Ghose, Aditya Agrawal, Niladrish Chatterjee, Abhijith Kashyap, Donghyuk Lee, Mike O'Connor, Hasan Hassan, O. Mutlu","doi":"10.1145/3078505.3078590","DOIUrl":"https://doi.org/10.1145/3078505.3078590","url":null,"abstract":"The energy consumption of DRAM is a critical concern in modern computing systems. Improvements in manufacturing process technology have allowed DRAM vendors to lower the DRAM supply voltage conservatively, which reduces some of the DRAM energy consumption. We would like to reduce the DRAM supply voltage more aggressively, to further reduce energy. Aggressive supply voltage reduction requires a thorough understanding of the effect voltage scaling has on DRAM access latency and DRAM reliability. In this paper, we take a comprehensive approach to understanding and exploiting the latency and reliability characteristics of modern DRAM when the supply voltage is lowered below the nominal voltage level specified by manufacturers.","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126610465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 5: Towards Efficient and Durable Storage","authors":"B. Urgaonkar","doi":"10.1145/3248540","DOIUrl":"https://doi.org/10.1145/3248540","url":null,"abstract":"","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114259810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scheduling Coflows in Datacenter Networks: Improved Bound for Total Weighted Completion Time","authors":"Mehrnoosh Shafiee, Javad Ghaderi","doi":"10.1145/3078505.3078548","DOIUrl":"https://doi.org/10.1145/3078505.3078548","url":null,"abstract":"Coflow is a recently proposed networking abstraction to capture communication patterns in data-parallel computing frameworks. We consider the problem of efficiently scheduling coflows with release dates in a shared datacenter network so as to minimize the total weighted completion time of coflows. Specifically, we propose a randomized algorithm with an approximation ratio of 3e ≈ 8.155, which improves the prior best known ratio of 9 + 16√2/3 ≈ 16.542. For the special case when all coflows are released at time zero, we obtain a randomized algorithm with an approximation ratio of 2e ≈ 5.436, which improves the prior best known ratio of 3 + 2√2 ≈ 5.828. Simulation results using a real traffic trace show improvement over the prior approaches.","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124756275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simplex Queues for Hot-Data Download","authors":"M. Aktaş, E. Najm, E. Soljanin","doi":"10.1145/3078505.3078553","DOIUrl":"https://doi.org/10.1145/3078505.3078553","url":null,"abstract":"In distributed systems, reliable data storage is accomplished through redundancy, which has traditionally been achieved by simple replication of data across multiple nodes [6]. A special class of erasure codes, known as locally repairable codes (LRCs) [7], has started to replace replication in practice [8], as a more storage-efficient way to provide a desired reliability. It has recently been recognized that storage redundancy can also provide fast access to stored data (see e.g. [5,9,10] and references therein). Most of these papers consider download scenarios of all jointly encoded pieces of data, and very few [11,12,14] are concerned with download of only some, possibly hot, pieces of data that are jointly encoded with those of less interest. So far, only the low-traffic regime has been partially addressed. In this paper, we are concerned with hot data download from systems implementing a special class of locally repairable codes, known as LRCs with availability [13,15]. 
We consider simplex codes, a particular subclass of LRCs with availability, because 1) they are in a certain sense optimal [2] and 2) they are minimally different from replication.","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121372277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Gradient-Based Optimization: Accelerated, Distributed, Asynchronous and Stochastic","authors":"Michael I. Jordan","doi":"10.1145/3143314.3078506","DOIUrl":"https://doi.org/10.1145/3143314.3078506","url":null,"abstract":"Many new theoretical challenges have arisen in the area of gradient-based optimization for large-scale statistical data analysis, driven by the needs of applications and the opportunities provided by new hardware and software platforms. I discuss several recent results in this area, including: (1) a new framework for understanding Nesterov acceleration, obtained by taking a continuous-time, Lagrangian/Hamiltonian perspective, (2) a general theory of asynchronous optimization in multi-processor systems, (3) a computationally-efficient approach to stochastic variance reduction, (4) a primal-dual methodology for gradient-based optimization that targets communication bottlenecks in distributed systems, and (5) a discussion of how to avoid saddle-points in nonconvex optimization.","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131641899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting Data Longevity for Enhancing the Lifetime of Flash-based Storage Class Memory","authors":"Wonil Choi, M. Arjomand, Myoungsoo Jung, M. Kandemir","doi":"10.1145/3078505.3078527","DOIUrl":"https://doi.org/10.1145/3078505.3078527","url":null,"abstract":"This paper proposes to exploit the capability of retention time relaxation in flash memories for improving the lifetime of an SLC-based SSD. The main idea is that, as a majority of I/O data in a typical workload do not need a retention time larger than a few days, we can have multiple partial program states in a cell and use every two states to store one bit of data at a time. Thus, we can store multiple bits in a cell (one bit at a time) without erasing it after each write, which directly translates into lifetime enhancement. The proposed scheme, called the Dense-SLC (D-SLC) flash design, improves SSD lifetime by 5.1X--8.6X.","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115722003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 3: Assessing Vulnerability of Large Networks","authors":"A. Wierman","doi":"10.1145/3248537","DOIUrl":"https://doi.org/10.1145/3248537","url":null,"abstract":"","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115414624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 8: Analyzing and Controlling Network Interaction","authors":"Nicolas Gast","doi":"10.1145/3248545","DOIUrl":"https://doi.org/10.1145/3248545","url":null,"abstract":"","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131137271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hieroglyph: Locally-Sufficient Graph Processing via Compute-Sync-Merge","authors":"Xiaoen Ju, H. Jamjoom, K. Shin","doi":"10.1145/3078505.3078589","DOIUrl":"https://doi.org/10.1145/3078505.3078589","url":null,"abstract":"Mainstream graph processing systems (such as Pregel [3] and PowerGraph [1]) follow the bulk synchronous parallel model. This design leads to the tight coupling of computation and communication, where no vertex can proceed to the next iteration of computation until all vertices have been processed in the current iteration and graph states have been synchronized across all hosts. This coupling of computation and communication incurs a significant performance penalty. Fully decoupling computation from communication requires (i) restricted access to only local state during computation and (ii) independence of inter-host communication from computation. We call the combination of both conditions local sufficiency. Local sufficiency is not efficiently supported by the state of the art. Synchronous systems, by design, do not support local sufficiency due to their intrinsic computation-communication coupling. Even systems that implement asynchronous execution only partially achieve local sufficiency. 
For example, PowerGraph's asynchronous mode satisfies local sufficiency by distributed scheduling.","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124318450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hadoop on Named Data Networking: Experience and Results","authors":"Mathias Gibbens, C. Gniady, Lei Ye, Beichuan Zhang","doi":"10.1145/3078505.3078508","DOIUrl":"https://doi.org/10.1145/3078505.3078508","url":null,"abstract":"In today's data centers, clusters of servers are arranged to perform various tasks in a massively distributed manner: handling web requests, processing scientific data, and running simulations of real-world problems. These clusters are very complex, and require a significant amount of planning and administration to ensure that they perform to their maximum potential. Planning and configuration can be a long and complicated process; once completed it is hard to completely re-architect an existing cluster. In addition to planning the physical hardware, the software must also be properly configured to run on a cluster. Information such as which server is in which rack and the total network bandwidth between rows of racks constrain the placement of jobs scheduled to run on a cluster. Some software may be able to use hints provided by a user about where to schedule jobs, while others may simply place them randomly and hope for the best. Every cluster has at least one bottleneck that constrains the overall performance to less than the optimal that may be achieved on paper. One common bottleneck is the speed of the network: communication between servers in a rack may be unable to saturate their network connections, but traffic flowing between racks or rows in a data center can easily overwhelm the interconnect switches. Various network topologies have been proposed to help mitigate this problem by providing multiple paths between points in the network, but they all suffer from the same fundamental problem: it is cost-prohibitive to build a network that can provide concurrent full network bandwidth between all servers. 
Researchers have been working on developing new network protocols that can make more efficient use of existing network hardware through a blurring of the line between network layer and applications. One of the most well-known examples of this is Named Data Networking (NDN), a data-centric network architecture that has been in development for several years. While NDN has received significant attention for wide-area Internet, a detailed understanding of NDN benefits and challenges in the data center environment has been lacking. The Named Data Networking architecture retrieves content by names rather than connecting to specific hosts. It provides benefits such as highly efficient and resilient content distribution, which fit well to data-intensive distributed computing. This paper presents and discusses our experience in modifying Apache Hadoop, a popular MapReduce framework, to operate on an NDN network. Through this first-of-its-kind implementation process, we demonstrate the feasibility of running an existing, large, and complex piece of distributed software commonly seen in data centers over NDN. We show advantages such as simplified network code and reduced network traffic, which are beneficial in a data center environment. 
There are also challenges faced by NDN that are being addressed by the community, which can be magnified under dat","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124061998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}