DAIET

Proceedings of the 2017 Symposium on Cloud Computing | Pub Date: 2017-09-24 | DOI: 10.1145/3127479.3132018

Amedeo Sapio, I. Abdelaziz, Marco Canini, Panos Kalnis


Citation count: 1

Abstract

1 CONTEXT AND MOTIVATION

Many data center applications rely on distributed computation models such as MapReduce and Bulk Synchronous Parallel (BSP) for data-intensive computation at scale [4]. These models scale by leveraging the partition/aggregate pattern, in which data and computation are distributed across many worker servers, each performing part of the computation. A communication phase is needed each time the workers synchronize the computation and, finally, to produce the output. In these applications, network communication costs can be one of the dominant scalability bottlenecks, especially for multi-stage or iterative computations [1].

The advent of flexible networking hardware and expressive data plane programming languages has produced networks that are deeply programmable [2]. This creates the opportunity to co-design distributed systems with their network layer, which can offer substantial performance benefits. One possible use of this emerging technology is to execute logic traditionally associated with the application layer in the network itself. Because the intermediate results in the applications above are necessarily exchanged through the network, it is desirable to offload part of the aggregation task to the network, reducing traffic and lessening the work of the servers. However, programmable networking devices typically impose very stringent constraints on the number and type of operations that can be performed at line rate. Moreover, packet processing at high speed requires very fast memory, such as TCAM or SRAM, which is expensive and usually available only in small capacities.
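The trade-off between offloading aggregation and limited device memory can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' design: the function names (`worker_output`, `switch_aggregate`, `reduce_pairs`), the word-count workload, and the "forward whatever does not fit" policy are all hypothetical choices made for illustration, and a Python dictionary merely stands in for a small on-device table.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, int]


def worker_output(words: Iterable[str]) -> List[Pair]:
    """Map step of a word count: emit one (word, 1) pair per word."""
    return [(word, 1) for word in words]


def switch_aggregate(stream: Iterable[Pair], capacity: int) -> List[Pair]:
    """Sum values per key in a table holding at most `capacity` keys.

    Hypothetical policy: a pair whose key does not fit in the table is
    forwarded unchanged, so the final result stays correct and only the
    amount of traffic reduction depends on the table size.
    """
    table: Dict[str, int] = {}
    forwarded: List[Pair] = []
    for key, value in stream:
        if key in table:
            table[key] += value          # aggregate in place
        elif len(table) < capacity:
            table[key] = value           # claim a free table entry
        else:
            forwarded.append((key, value))  # table full: pass through
    return forwarded + list(table.items())


def reduce_pairs(pairs: Iterable[Pair]) -> Dict[str, int]:
    """Final aggregation on the receiving server."""
    totals: Counter = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    workers = [["a", "b", "a"], ["b", "c", "a"]]
    stream = [pair for words in workers for pair in worker_output(words)]
    for capacity in (0, 2, 8):
        out = switch_aggregate(stream, capacity)
        print(capacity, len(stream), len(out), reduce_pairs(out))
```

With a table of zero entries every pair is forwarded and the server does all the work; as the table grows, fewer pairs reach the server while the reduced result is unchanged. Real devices add further limits (the kinds of operations available at line rate, fixed-width keys) that this sketch does not model.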

