Proceedings of the Thirteenth EuroSys Conference: Latest Publications

Riffle: optimized shuffle service for large-scale data analytics
Pub Date: 2018-04-23 · DOI: 10.1145/3190508.3190534
Haoyu Zhang, Brian Cho, Ergin Seyfe, A. Ching, M. Freedman
Abstract: The rapidly growing size of data and complexity of analytics present new challenges for large-scale data processing systems. Modern systems keep data partitions in memory for pipelined operators, and persist data across stages with wide dependencies on disks for fault tolerance. While processing can often scale well by splitting jobs into smaller tasks for better parallelism, all-to-all data transfers---called shuffle operations---become the scaling bottleneck when running many small tasks in multi-stage data analytics jobs. Our key observation is that this bottleneck is due to the superlinear increase in disk I/O operations as data volume increases. We present Riffle, an optimized shuffle service for big-data analytics frameworks that significantly improves I/O efficiency and scales to process petabytes of data. To do so, Riffle efficiently merges fragmented intermediate shuffle files into larger block files, and thus converts small, random disk I/O requests into large, sequential ones. Riffle further improves performance and fault tolerance by mixing both merged and unmerged block files to minimize merge operation overhead. Using Riffle, Facebook production jobs on Spark clusters with over 1,000 executors experience up to a 10x reduction in the number of shuffle I/O requests and a 40% improvement in end-to-end job completion time.
Citations: 53
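
The merge step is the heart of the optimization: many small per-map shuffle files become one large block file plus an index, so reducers issue large sequential reads instead of many small random ones. A minimal Python sketch of that idea follows; the file layout and index format are illustrative assumptions, not Riffle's actual on-disk format.

```python
def merge_shuffle_files(fragment_paths, merged_path, index_path):
    """Concatenate many small shuffle fragments into one block file.

    Reducers can then issue one large sequential read (using the
    offset and length recorded in the index) instead of many small
    random reads against individual fragments.
    """
    index = []          # (fragment name, offset, length) records
    offset = 0
    with open(merged_path, "wb") as merged:
        for path in fragment_paths:
            with open(path, "rb") as fragment:
                data = fragment.read()
            merged.write(data)
            index.append((path, offset, len(data)))
            offset += len(data)
    with open(index_path, "w") as idx:
        for name, off, length in index:
            idx.write(f"{name}\t{off}\t{length}\n")
    return index
```
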
Model-driven computational sprinting
Pub Date: 2018-04-23 · DOI: 10.1145/3190508.3190543
Nathaniel Morris, Christopher Stewart, L. Chen, R. Birke, Jaimie Kelley
Abstract: Computational sprinting speeds up query execution by increasing power usage for short bursts. Sprinting policy decides when and how long to sprint. Poor policies inflate response time significantly. We propose a model-driven approach that chooses between sprinting policies based on their expected response time. However, sprinting alters query executions at runtime, creating a complex dependency between queuing and processing time. Our performance modeling approach employs offline profiling, machine learning, and first-principles simulation. Collectively, these modeling techniques capture the effects of sprinting on response time. We validated our modeling approach with 3 sprinting mechanisms across 9 workloads. Our performance modeling approach predicted response time with median error below 4% in most tests and median error of 11% in the worst case. We demonstrated model-driven sprinting for cloud providers seeking to colocate multiple workloads on AWS Burstable Instances while meeting service level objectives. Model-driven sprinting uncovered policies that achieved response time goals, allowing more workloads to colocate on a node. Compared to AWS Burstable policies, our approach increased revenue per node by 1.6x.
Citations: 27
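
At its core, the model-driven approach picks whichever sprinting policy minimizes predicted response time for the workload at hand. A hedged sketch of that selection step, where `predict_response_time` and the `SprintPolicy` fields are assumed stand-ins for the paper's profiling, learning, and simulation pipeline:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SprintPolicy:
    name: str
    duration_s: float   # how long to sprint per burst
    power_w: float      # power level while sprinting

def pick_policy(policies, predict_response_time, workload):
    """Model-driven selection: run the policy the model expects to
    yield the lowest response time. `predict_response_time(policy,
    workload)` is an assumed callable standing in for the paper's
    offline profiling + ML + first-principles simulation models.
    """
    return min(policies, key=lambda p: predict_response_time(p, workload))

# Example with a trivial made-up predictor:
policies = [SprintPolicy("never", 0.0, 50.0),
            SprintPolicy("burst-5s", 5.0, 95.0)]
best = pick_policy(policies, lambda p, w: w / (1.0 + p.duration_s), 100.0)
```
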
Scrub: online troubleshooting for large mission-critical applications
Pub Date: 2018-04-23 · DOI: 10.1145/3190508.3190513
A. Satish, Thomas Shiou, Chuck Zhang, Khaled Elmeleegy, W. Zwaenepoel
Abstract: Scrub is a troubleshooting tool for distributed applications that operate under strict SLOs common in production environments. It allows users to formulate queries on events occurring during execution in order to assess the correctness of the application's operation. Scrub has been in use for two years at Turn, where developers and users have relied on it to resolve numerous issues in its online advertisement bidding platform. This platform spans thousands of machines across the globe, serving several million bid requests per second, and dispensing many millions of dollars in advertising budgets. Troubleshooting distributed applications is notoriously hard, and its difficulty is exacerbated by the presence of strict SLOs, which requires the troubleshooting tool to have only minimal impact on the hosts running the application. Furthermore, with large amounts of money at stake, users expect to be able to run frequent diagnostics and demand quick evaluation and remediation of any problems. These constraints have led to a number of design and implementation decisions that run counter to conventional wisdom. In particular, Scrub supports only a restricted form of joins, and its query execution strategy eschews imposing any overhead on the application hosts: joins, group-by operations, and aggregations are sent to a dedicated centralized facility. In terms of implementation, Scrub avoids the overhead and security concerns of dynamic instrumentation. Finally, at all levels of the system, accuracy is traded for minimal impact on the hosts. We present the design and implementation of Scrub and contrast its choices to those made in earlier systems. We illustrate its power by describing a number of use cases, and we demonstrate its negligible overhead on the underlying application: we observe a CPU overhead of at most 2.5% on application hosts and a 1% increase in request latency. These overheads allow the advertisement bidding platform to operate well within its SLOs.
Citations: 4
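
The split the abstract describes, cheap selection on the application hosts and expensive joins/group-bys at a central facility, can be sketched as follows. The function names and event model are illustrative assumptions, not Scrub's API:

```python
from collections import defaultdict

def host_side_filter(events, predicate):
    """Runs on each application host: selection only, keeping the
    query's footprint on SLO-bound hosts minimal."""
    return [e for e in events if predicate(e)]

def central_aggregate(filtered_streams, key_fn):
    """Runs at the dedicated central facility: group-by and count,
    so joins and aggregations never load the application hosts."""
    groups = defaultdict(int)
    for stream in filtered_streams:
        for event in stream:
            groups[key_fn(event)] += 1
    return dict(groups)

# Toy example: count timed-out bids per data center across hosts.
all_host_events = [
    [{"timed_out": True, "datacenter": "ams"},
     {"timed_out": False, "datacenter": "ams"}],
    [{"timed_out": True, "datacenter": "sjc"}],
]
streams = [host_side_filter(evts, lambda e: e["timed_out"])
           for evts in all_host_events]
by_dc = central_aggregate(streams, key_fn=lambda e: e["datacenter"])
```
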
Dynamic control flow in large-scale machine learning
Pub Date: 2018-04-23 · DOI: 10.1145/3190508.3190551
Yuan Yu, Martín Abadi, P. Barham, E. Brevdo, M. Burrows, Andy Davis, J. Dean, S. Ghemawat, Tim Harley, Peter Hawkins, M. Isard, M. Kudlur, R. Monga, D. Murray, Xiaoqiang Zheng
Abstract: Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditional execution, and other features that call for dynamic control flow. These applications benefit from the ability to make rapid control-flow decisions across a set of computing devices in a distributed system. For performance, scalability, and expressiveness, a machine learning system must support dynamic control flow in distributed and heterogeneous environments. This paper presents a programming model for distributed machine learning that supports dynamic control flow. We describe the design of the programming model, and its implementation in TensorFlow, a distributed machine learning system. Our approach extends the use of dataflow graphs to represent machine learning models, offering several distinctive features. First, the branches of conditionals and bodies of loops can be partitioned across many machines to run on a set of heterogeneous devices, including CPUs, GPUs, and custom ASICs. Second, programs written in our model support automatic differentiation and distributed gradient computations, which are necessary for training machine learning models that use control flow. Third, our choice of non-strict semantics enables multiple loop iterations to execute in parallel across machines, and to overlap compute and I/O operations. We have done our work in the context of TensorFlow, and it has been used extensively in research and production. We evaluate it using several real-world applications, and demonstrate its performance and scalability.
Citations: 96
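
The user-facing surface for this in TensorFlow is tf.cond and tf.while_loop, which embed data-dependent branches and loops directly in the dataflow graph; the computation below is arbitrary and chosen only to show the shape of the API:

```python
import tensorflow as tf

# A data-dependent loop: keep doubling `x` until its sum exceeds 100.
# Because the loop lives in the dataflow graph, its iterations can be
# partitioned across devices and differentiated automatically.
x = tf.constant([1.0, 2.0, 3.0])

result = tf.while_loop(
    cond=lambda v: tf.reduce_sum(v) < 100.0,   # data-dependent predicate
    body=lambda v: v * 2.0,                    # loop body
    loop_vars=[x],
)

# Data-dependent branching with tf.cond:
y = tf.cond(tf.reduce_sum(x) > 5.0,
            lambda: x * 10.0,   # taken branch
            lambda: x + 1.0)    # untaken branch
```

The non-strict semantics the abstract mentions is what lets distinct iterations of such a loop run in parallel across machines once their inputs are ready.
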
Service fabric: a distributed platform for building microservices in the cloud
Pub Date: 2018-04-23 · DOI: 10.1145/3190508.3190546
Gopal Kakivaya, Lu Xun, Richard Hasha, S. Ahsan, Todd Pfleiger, R. Sinha, Anurag Gupta, Mihail Tarta, M. Fussell, Vipul Modi, M. Mohsin, Ray Kong, Anmol Ahuja, Oana Platon, Alex Wun, Matthew Snider, Chacko Daniel, Dan Mastrian, Yang Li, A. Rao, Vaishnav Kidambi, Randy Wang, A. Ram, S. Shivaprakash, R. Nair, Alan Warwick, Bharat S. Narasimman, Meng-Jang Lin, Jeffrey Chen, Abhay Balkrishna Mhatre, Preetha Subbarayalu, M. Coskun, Indranil Gupta
Abstract: We describe Service Fabric (SF), Microsoft's distributed platform for building, running, and maintaining microservice applications in the cloud. SF has been running in production for 10+ years, powering many critical services at Microsoft. This paper outlines key design philosophies in SF. We then adopt a bottom-up approach to describe low-level components in its architecture, focusing on modular use and support for strong semantics like fault-tolerance and consistency within each component of SF. We discuss lessons learned, and present experimental results from production data.
Citations: 48
DCAPS: dynamic cache allocation with partial sharing
Pub Date: 2018-04-23 · DOI: 10.1145/3190508.3190511
Yaocheng Xiang, Xiaolin Wang, Zihui Huang, Zeyu Wang, Yingwei Luo, Zhenlin Wang
Abstract: In a multicore system, effective management of the shared last-level cache (LLC), such as hardware/software cache partitioning, has attracted significant research attention. A notable recent advance is Intel's introduction of Cache Allocation Technology (CAT) in its commodity processors. CAT implements way partitioning and provides a software interface to control cache allocation. Unfortunately, CAT can only allocate at way granularity, which does not scale well to large thread or program counts with diverse performance goals. This paper proposes Dynamic Cache Allocation with Partial Sharing (DCAPS), a framework that dynamically monitors and predicts a multi-programmed workload's cache demand, and reallocates the LLC given a performance target. Further, DCAPS explores partial sharing of a cache partition among programs and thus practically achieves cache allocation at a finer granularity. DCAPS consists of three parts: (1) Online Practical Miss Rate Curve (OPMRC), a low-overhead software technique to predict online miss rate curves (MRCs) of the individual programs of a workload; (2) a prediction model that estimates the LLC occupancy of each individual program under any CAT allocation scheme; (3) a simulated annealing algorithm that searches for a near-optimal CAT scheme given a specific performance goal. Our experimental results show that DCAPS is able to optimize for a wide range of performance targets and can scale to a large core count.
Citations: 55
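
The search component is a standard simulated annealing loop over candidate CAT allocation schemes. A generic sketch, where the `neighbor` move and `cost` function are caller-supplied assumptions (for instance, moving one cache way between programs, scored against the performance target):

```python
import math
import random

def simulated_annealing(initial_scheme, neighbor, cost,
                        t_start=1.0, t_end=1e-3, alpha=0.95, steps_per_t=50):
    """Search for a near-optimal cache-allocation scheme.

    `neighbor(scheme)` proposes a small perturbation and `cost(scheme)`
    scores it against the performance goal; both are assumed,
    caller-supplied functions. Worse candidates are accepted with
    probability exp(-delta / T) so the search can escape local minima.
    """
    current, current_cost = initial_scheme, cost(initial_scheme)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            candidate = neighbor(current)
            delta = cost(candidate) - current_cost
            if delta < 0 or random.random() < math.exp(-delta / t):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= alpha          # cool down: accept fewer uphill moves
    return best
```
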
Medea: scheduling of long running applications in shared production clusters
Pub Date: 2018-04-23 · DOI: 10.1145/3190508.3190549
Panagiotis Garefalakis, Konstantinos Karanasos, P. Pietzuch, Arun Suresh, Sriram Rao
Abstract: The rise in popularity of machine learning, streaming, and latency-sensitive online applications in shared production clusters has raised new challenges for cluster schedulers. To optimize their performance and resilience, these applications require precise control of their placements, by means of complex constraints, e.g., to collocate or separate their long-running containers across groups of nodes. In the presence of these applications, the cluster scheduler must attain global optimization objectives, such as maximizing the number of deployed applications or minimizing the violated constraints and the resource fragmentation, but without affecting the scheduling latency of short-running containers. We present Medea, a new cluster scheduler designed for the placement of long- and short-running containers. Medea introduces powerful placement constraints with formal semantics to capture interactions among containers within and across applications. It follows a novel two-scheduler design: (i) for long-running containers, it applies an optimization-based approach that accounts for constraints and global objectives; (ii) for short-running containers, it uses a traditional task-based scheduler for low placement latency. Evaluated on a 400-node cluster, our implementation of Medea on Apache Hadoop YARN achieves placement of long-running applications with significant performance and resilience benefits compared to state-of-the-art schedulers.
Citations: 89
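
The collocate/separate constraints the abstract mentions can be read as cardinality bounds over tagged containers within a scope such as a node or rack. A toy encoding of that shape, with field names that are illustrative assumptions rather than Medea's actual interface:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlacementConstraint:
    """Toy cardinality constraint: within each `scope`, a container
    carrying `subject_tag` must see between `min_cardinality` and
    `max_cardinality` containers carrying `target_tag`."""
    subject_tag: str
    target_tag: str
    scope: str            # e.g., "node" or "rack"
    min_cardinality: int
    max_cardinality: int

# Anti-affinity: at most one other region server per node.
anti_affinity = PlacementConstraint("hbase-rs", "hbase-rs", "node", 0, 1)
# Affinity: place stream workers on the same rack as their cache.
affinity = PlacementConstraint("storm-worker", "memcached", "rack", 1, 10)
```
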
dCat: dynamic cache management for efficient, performance-sensitive infrastructure-as-a-service
Pub Date: 2018-04-23 · DOI: 10.1145/3190508.3190555
Cong Xu, K. Rajamani, Alexandre Ferreira, Wes Felter, J. Rubio, Y. Li
Abstract: In the modern multi-tenant cloud, resource sharing increases utilization but causes performance interference between tenants. More generally, performance isolation is relevant in any multi-workload scenario involving shared resources. On x86 processors, the last-level cache (LLC) is shared by all CPU cores, so cloud tenants inevitably suffer cache evictions caused by noisy neighbors running on the same socket. Intel Cache Allocation Technology (CAT) provides a mechanism to assign cache ways to cores to enable cache isolation, but its static configuration can result in an underutilized cache when a workload cannot benefit from its allocated cache capacity, and/or lead to sub-optimal performance for workloads that do not have enough assigned capacity to fit their working set. In this work, we propose a new dynamic cache management technology (dCat) to provide strong cache isolation with better performance. For each workload, we target a consistent, minimum performance bound irrespective of others on the socket and dependent only on its rightful share of the LLC capacity. In addition, when there is spare capacity on the socket, or when some workloads are not obtaining beneficial performance from their cache allocation, dCat dynamically reallocates cache space to cache-intensive workloads. We have implemented dCat in Linux on top of CAT to dynamically adjust cache mappings. dCat requires no modifications to applications, so it can be applied to all cloud workloads. In our evaluation, we see an average of 25% improvement over shared cache and 15.7% over static CAT for selected memory-intensive SPEC CPU2006 workloads. For typical cloud workloads, with Redis we see a 57.6% improvement over shared LLC and a 26.6% improvement over static partitioning, and with ElasticSearch we see an 11.9% improvement over both.
Citations: 48
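
A dCat-style controller periodically shrinks allocations that yield no benefit back to a guaranteed share and hands the freed ways to cache-hungry workloads. A hedged sketch of one such rebalancing step; the benefit metric is invented, and the real system programs Intel CAT bitmasks rather than plain integers:

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    ways: int           # LLC ways currently assigned

def rebalance_ways(workloads, min_share, benefit):
    """One rebalancing step in the spirit of dCat.

    `benefit(w)` is an assumed metric (e.g., estimated miss-rate
    reduction per extra way). Workloads gaining nothing from their
    allocation shrink to the guaranteed share `min_share`, which
    preserves each tenant's minimum performance bound; the freed
    ways go to the workload that benefits most.
    """
    freed = 0
    for w in workloads:
        if benefit(w) <= 0 and w.ways > min_share:
            freed += w.ways - min_share
            w.ways = min_share
    if freed:
        max(workloads, key=benefit).ways += freed
    return workloads
```
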
Hang doctor: runtime detection and diagnosis of soft hangs for smartphone apps
Pub Date: 2018-04-23 · DOI: 10.1145/3190508.3190525
Marco Brocanelli, Xiaorui Wang
Abstract: A critical quality factor for smartphone apps is responsiveness, which indicates how fast an app reacts to user actions. A soft hang occurs when the app's response time for handling a certain user action is longer than a user-perceivable delay. Soft hangs can be caused by normal User Interface (UI) rendering or by blocking operations that should not be conducted on the app's main thread (i.e., soft hang bugs). Existing solutions for soft hang bug detection focus mainly on offline app code examination to find previously known blocking operations and then move them off the main thread. Unfortunately, such offline solutions can fail to identify blocking operations that are previously unknown or hidden in libraries. In this paper, we present Hang Doctor, a runtime methodology that supplements the existing offline algorithms by detecting and diagnosing soft hangs caused by previously unknown blocking operations. Hang Doctor features a two-phase algorithm that first checks response time and performance event counters to detect possible soft hang bugs with small overhead, and then performs stack trace analysis when diagnosis is necessary. A novel soft hang filter based on correlation analysis is designed to minimize false positives and negatives for high detection performance and low overhead. We have implemented a prototype of Hang Doctor and tested it with the latest releases of 114 real-world apps. Hang Doctor has identified 34 new soft hang bugs previously unknown to their developers, of which 62% have so far been confirmed by the developers, and 68% are missed by offline algorithms.
Citations: 12
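
The two-phase structure can be sketched as: always measure response time and performance counters cheaply, and only pay for stack-trace diagnosis when an action is flagged. Everything below, including the counter-based filter, is an illustrative stand-in for the paper's correlation-based design:

```python
import time

PERCEIVABLE_DELAY_S = 0.2   # illustrative user-perceivable threshold

def handle_action(action, run_handler, read_perf_counters, diagnose):
    """Two-phase soft-hang detection in the spirit of Hang Doctor.

    Phase 1 (cheap, always on): time the handler and sample hardware
    performance counters. Phase 2 (expensive, on demand): collect and
    analyze a stack trace only when phase 1 flags the action.
    """
    counters_before = read_perf_counters()
    start = time.monotonic()
    run_handler(action)
    elapsed = time.monotonic() - start
    counter_delta = read_perf_counters() - counters_before

    if elapsed > PERCEIVABLE_DELAY_S and looks_like_blocking(counter_delta):
        return diagnose(action)      # phase 2: stack trace analysis
    return None                      # fast path: no diagnosis cost

def looks_like_blocking(counter_delta):
    # Placeholder for the soft hang filter: the real design correlates
    # counter patterns with blocking operations to suppress false
    # positives (e.g., heavy but legitimate UI rendering).
    return counter_delta > 0
```
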
Delta pointers: buffer overflow checks without the checks
Pub Date: 2018-04-23 · DOI: 10.1145/3190508.3190553
Taddeus Kroes, Koen Koning, E. V. D. Kouwe, H. Bos, Cristiano Giuffrida
Abstract: Despite decades of research, buffer overflows still rank among the most dangerous vulnerabilities in unsafe languages such as C and C++. Compared to other memory corruption vulnerabilities, buffer overflows are both common and typically easy to exploit. Yet, they have proven so challenging to detect in real-world programs that existing solutions either yield very poor performance, or introduce incompatibilities with the C/C++ language standard. We present Delta Pointers, a new solution for buffer overflow detection based on efficient pointer tagging. By carefully altering the pointer representation, without violating language specifications, Delta Pointers use existing hardware features to detect both contiguous and non-contiguous overflows on dereferences, without a single check incurring extra branch or memory access operations. By focusing on buffer overflows rather than other vulnerabilities (e.g., underflows), Delta Pointers offer a unique checkless design to provide high performance while still maintaining compatibility. We show that Delta Pointers are effective in detecting arbitrary buffer overflows and, at 35% overhead on SPEC, offer much better performance than competing solutions.
Citations: 46
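
The checkless design rests on arithmetic: a delta tag in the pointer's upper bits is initialized so that ordinary pointer addition carries into the topmost bit exactly when the pointer moves past its object's end, and masking on dereference then yields a non-canonical address that faults in hardware. The Python model below imitates that arithmetic with 64-bit integers, using a simplified 32-bit address / 31-bit delta / 1-bit overflow layout; it is a model of the idea, not the compiler instrumentation.

```python
DELTA_SHIFT = 32
OVERFLOW_BIT = 1 << 63
MASK_64 = (1 << 64) - 1
DEREF_MASK = OVERFLOW_BIT | ((1 << 32) - 1)   # keep address + overflow bit

def make_delta_pointer(addr, size):
    """Tag a 32-bit address with a delta of 2**31 - size: once the
    pointer advances `size` bytes, the delta carries into the top bit."""
    delta = (1 << 31) - size
    return ((delta << DELTA_SHIFT) | addr) & MASK_64

def advance(ptr, n):
    """ptr + n: a single 64-bit add updates address and delta together."""
    return (ptr + ((n << DELTA_SHIFT) | n)) & MASK_64

def dereference(ptr):
    """Strip the delta but keep the overflow bit. With the bit set the
    address is non-canonical; a real CPU faults, we raise instead."""
    addr = ptr & DEREF_MASK
    if addr & OVERFLOW_BIT:
        raise MemoryError("buffer overflow detected (non-canonical address)")
    return addr

p = make_delta_pointer(addr=0x1000, size=16)
dereference(advance(p, 15))          # last valid byte: fine
try:
    dereference(advance(p, 16))      # one byte past the end
except MemoryError as e:
    print("caught:", e)
```

Because the dereference mask is a constant AND folded into address computation, no branch or extra memory access is needed on the hot path, which is what the title's "checks without the checks" refers to.
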