Proceedings of the Thirteenth EuroSys Conference: latest papers, page 2

Wide-area analytics with multiple resources

Proceedings of the Thirteenth EuroSys Conference Pub Date : 2018-04-23 DOI: 10.1145/3190508.3190528

Chien-Chun Hung, G. Ananthanarayanan, L. Golubchik, Minlan Yu, Mingyang Zhang

Citations: 62

Analytics with smart arrays: adaptive and efficient language-independent data

Proceedings of the Thirteenth EuroSys Conference Pub Date : 2018-04-23 DOI: 10.1145/3190508.3190514

Iraklis Psaroudakis, Stefan Kaestle, Matthias Grimmer, D. Goodman, Jean-Pierre Lozi, T. Harris

Citations: 6

Rock you like a hurricane: taming skew in large scale analytics

Proceedings of the Thirteenth EuroSys Conference Pub Date : 2018-04-23 DOI: 10.1145/3190508.3190532

Laurent Bindschaedler, Jasmina Malicevic, Nicolas Schiper, Ashvin Goel, W. Zwaenepoel

Abstract: Current cluster computing frameworks suffer from load imbalance and limited parallelism due to skewed data distributions, processing times, and machine speeds. We observe that the underlying cause for these issues in current systems is that they partition work statically. Hurricane is a high-performance large-scale data analytics system that successfully tames skew in novel ways. Hurricane performs adaptive work partitioning based on load observed by nodes at runtime. Overloaded nodes can spawn clones of their tasks at any point during their execution, with each clone processing a subset of the original data. This allows the system to adapt to load imbalance and dynamically adjust task parallelism to gracefully handle skew. We support this design by spreading data across all nodes and allowing nodes to retrieve data in a decentralized way. The result is that Hurricane automatically balances load across tasks, ensuring fast completion times. We evaluate Hurricane's performance on typical analytics workloads and show that it significantly outperforms state-of-the-art systems for both uniform and skewed datasets, because it ensures good CPU and storage utilization in all cases.
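
The contrast the abstract draws between static partitioning and runtime cloning can be illustrated with a toy model. The sketch below is not Hurricane's implementation: the function names, the fixed `chunk_size`, and the assumption that one unit of data costs one unit of time (with no cloning or data-transfer overhead) are all hypothetical simplifications.

```python
import heapq


def static_makespan(partition_sizes):
    """Completion time when each partition is bound to exactly one worker."""
    return max(partition_sizes)


def cloned_makespan(partition_sizes, num_workers, chunk_size):
    """Completion time when any worker may take the next chunk of any partition.

    Toy stand-in for task cloning: a skewed partition is split into chunks
    that several workers process in parallel.
    """
    # Split every partition into chunks of at most chunk_size units.
    chunks = []
    for size in partition_sizes:
        full, rest = divmod(size, chunk_size)
        chunks.extend([chunk_size] * full)
        if rest:
            chunks.append(rest)

    # Greedy assignment: the least-loaded worker takes the next chunk.
    loads = [0] * num_workers
    heapq.heapify(loads)
    for chunk in chunks:
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + chunk)
    return max(loads)


if __name__ == "__main__":
    skewed = [100, 10, 10, 10]
    print(static_makespan(skewed))         # one worker is stuck with 100 units
    print(cloned_makespan(skewed, 4, 10))  # work is spread over all four workers
```

With partitions of 100, 10, 10, and 10 units on four workers, the static schedule finishes at time 100 in this model, while the chunked schedule finishes at time 40.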

Citations: 32

Solros: a data-centric operating system architecture for heterogeneous computing

Proceedings of the Thirteenth EuroSys Conference Pub Date : 2018-04-23 DOI: 10.1145/3190508.3190523

Changwoo Min, Woon-Hak Kang, Mohan Kumar, Sanidhya Kashyap, Steffen Maass, Heeseung Jo, Taesoo Kim

Abstract: We propose Solros, a new operating system architecture for heterogeneous systems that comprise fast host processors, slow but massively parallel co-processors, and fast I/O devices. The general consensus is that fully driving such hardware requires tight integration among processors and I/O devices. Thus, in the Solros architecture, a co-processor OS (data-plane OS) delegates its services, specifically I/O stacks, to the host OS (control-plane OS). The observation behind this design is that global coordination with system-wide knowledge (e.g., PCIe topology, the load of each co-processor) and the best use of heterogeneous processors are critical to achieving high performance. Hence, we fully harness these specialized processors by delegating complex I/O stacks to fast host processors, which leads to efficient global coordination at the level of the control-plane OS. We developed Solros with Xeon Phi co-processors and implemented three core OS services: transport, file system, and network services. Our experimental results show significant performance improvement compared with the stock Xeon Phi running the Linux kernel. For example, Solros improves the throughput of file system and network operations by 19x and 7x, respectively. Moreover, it improves the performance of two realistic applications: 19x for text indexing and 2x for image search.

Citations: 22

3Sigma: distribution-based cluster scheduling for runtime uncertainty

Proceedings of the Thirteenth EuroSys Conference Pub Date : 2018-04-23 DOI: 10.1145/3190508.3190515

J. Park, Alexey Tumanov, Angela H. Jiang, M. Kozuch, G. Ganger

Abstract: The 3Sigma cluster scheduling system uses job runtime histories in a new way. Knowing how long each job will execute enables a scheduler to more effectively pack jobs with diverse time concerns (e.g., deadline vs. the-sooner-the-better) and placement preferences on heterogeneous cluster resources. But existing schedulers use single-point estimates (e.g., mean or median of a relevant subset of historical runtimes), and we show that they are fragile in the face of real-world estimate error profiles. In particular, analysis of job traces from three different large-scale cluster environments shows that, while the runtimes of many jobs can be predicted well, even state-of-the-art predictors have wide error profiles, with 8% to 23% of predictions off by a factor of two or more. Instead of reducing relevant history to a single point, 3Sigma schedules jobs based on full distributions of relevant runtime histories and explicitly creates plans that mitigate the effects of anticipated runtime uncertainty. Experiments with workloads derived from the same traces show that 3Sigma greatly outperforms a state-of-the-art scheduler that uses point estimates from a state-of-the-art predictor; in fact, the performance of 3Sigma approaches the end-to-end performance of a scheduler based on a hypothetical, perfect runtime predictor. 3Sigma reduces SLO miss rate, increases cluster goodput, and improves or matches latency for best effort jobs.
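
To make the difference between a single-point estimate and a full runtime distribution concrete, here is a minimal sketch. It does not reproduce 3Sigma's scheduler: the function names, the use of the mean as the point estimate, and the empirical-fraction probability are assumptions chosen for illustration only.

```python
from statistics import mean


def point_estimate_meets(history, slack):
    """Point-estimate view: does the mean historical runtime fit in the slack?"""
    return mean(history) <= slack


def prob_meets(history, slack):
    """Distribution view: fraction of historical runtimes that fit in the slack."""
    return sum(runtime <= slack for runtime in history) / len(history)


if __name__ == "__main__":
    # Hypothetical runtimes (in minutes) of similar past jobs; one is a straggler.
    history = [10, 10, 10, 50]
    slack = 20  # time left before the deadline if the job starts now

    print(point_estimate_meets(history, slack))  # the mean hides the straggler
    print(prob_meets(history, slack))            # the distribution exposes the risk
```

In this example the mean runtime is exactly 20 minutes, so the point estimate reports that the deadline will be met, while the empirical distribution shows that one run in four would miss it. A scheduler that sees the second number can weigh that risk when placing the job.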

Citations: 58

A frugal approach to reduce RCU grace period overhead

Proceedings of the Thirteenth EuroSys Conference Pub Date : 2018-04-23 DOI: 10.1145/3190508.3190522

Aravinda Prasad, Kanchi Gopinath

Abstract: Grace period computation is a core part of the Read-Copy-Update (RCU) synchronization technique that determines the safe time to reclaim the deferred objects' memory. We first show that the eager grace period computation employed in the Linux kernel is appropriate only for enterprise workloads such as web and database servers, where a large amount of reclaimable memory awaits the completion of a grace period. However, such memory is negligible in High-Performance Computing (HPC) and mostly idle environments due to limited OS kernel activity. Hence an eager approach is not only futile but also detrimental, as the CPU cycles consumed to compute a grace period lead to jitter in HPC and frequent CPU wake-ups in idle environments. We design frugal grace periods, an economical grace period computation for non-enterprise environments that consumes fewer CPU cycles. In addition, we reduce the number of grace periods either by using heuristics or by letting the memory allocator explicitly request a grace period only when it is running out of free objects. Our implementation in the Linux kernel reduces the number of grace periods by 68% to 99%, reduces the CPU time consumed by grace periods by 39% to 99%, improves the throughput by up to 28% for NAS parallel benchmarks, and increases the CPU time spent in low power states by 2.4x when the system is idle.
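
The idea of letting the allocator request a grace period only when it runs out of free objects can be sketched with a toy object pool. This is not the Linux kernel's RCU code or the paper's implementation: the `Pool` class, the `eager` flag, and the `run` helper are hypothetical, and a "grace period" here is just a counter increment standing in for waiting until all pre-existing readers have finished.

```python
class Pool:
    """Toy fixed-size object pool with deferred (RCU-style) reclamation."""

    def __init__(self, size, eager):
        self.free = list(range(size))  # objects available for allocation
        self.deferred = []             # freed objects awaiting a grace period
        self.eager = eager
        self.grace_periods = 0

    def _grace_period(self):
        # Stand-in for the real wait; afterwards deferred objects are reusable.
        self.grace_periods += 1
        self.free.extend(self.deferred)
        self.deferred.clear()

    def defer_free(self, obj):
        self.deferred.append(obj)
        if self.eager:
            # Eager model: compute a grace period as soon as work is pending.
            self._grace_period()

    def alloc(self):
        if not self.free and self.deferred:
            # On-demand model: request a grace period only when out of objects.
            self._grace_period()
        if not self.free:
            raise MemoryError("pool exhausted")
        return self.free.pop()


def run(eager, rounds=100, size=8):
    """Allocate and immediately defer-free one object per round."""
    pool = Pool(size, eager)
    for _ in range(rounds):
        pool.defer_free(pool.alloc())
    return pool.grace_periods


if __name__ == "__main__":
    print(run(eager=True))   # one grace period per deferred free
    print(run(eager=False))  # one grace period per pool refill
```

In this model, 100 rounds on a pool of 8 objects trigger 100 grace periods in the eager variant and 12 in the on-demand variant, because the latter batches reclamation until the free list is empty.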

Citations: 2

Flashabacus: a self-governing flash-based accelerator for low-power systems

Proceedings of the Thirteenth EuroSys Conference Pub Date : 2018-04-23 DOI: 10.1145/3190508.3190544

Jie Zhang, Myoungsoo Jung

Citations: 11

G-Miner: an efficient task-oriented graph mining system

Proceedings of the Thirteenth EuroSys Conference Pub Date : 2018-04-23 DOI: 10.1145/3190508.3190545

Hongzhi Chen, Miao Liu, Yunjian Zhao, Xiao Yan, Da Yan, James Cheng

Citations: 86

Proceedings of the Thirteenth EuroSys Conference Pub Date : 2018-04-23 DOI: 10.1145/3190508.3190542

Mainak Ghosh, Ashwini Raina, Le Xu, Xiaoyao Qian, Indranil Gupta, H. Gupta

Citations: 5

DumbNet: a smart data center network fabric with dumb switches

Proceedings of the Thirteenth EuroSys Conference Pub Date : 2018-04-23 DOI: 10.1145/3190508.3190531

Yiran Li, Da Wei, Xiaoqi Chen, Ziheng Song, Ruihan Wu, Yuxing Li, Xin Jin, W. Xu

Citations: 10