Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR): Latest Articles

Correct-by-Construction Network Programming for Stateful Data-Planes

Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR) Pub Date : 2021-10-11 DOI: 10.1145/3482898.3483362

Jedidiah McClurg

Abstract: As switch hardware becomes faster, more stateful, and more programmable, functionality that was once confined to end hosts or the control plane is being pushed into the data plane. For example, recent work on adaptive congestion control and heavy hitter detection uses stateful switches to implement sophisticated functionality with only minor controller involvement. In applications where correctness depends on individual switches making coherent decisions, it is important that the switches have a consistent view of global state. However, such a consistency requirement makes it difficult to maintain efficiency (high throughput), due to the CAP theorem. Moreover, previous work on data-plane programming provides little to no built-in support for addressing this difficulty. We propose Callback State Machines (CSMs), a new high-level declarative network programming abstraction which allows operators to write correct data-plane programs against global state. CSMs offer programmers useful consistency guarantees without the need to manage how global state is replicated/updated at the individual switch level. To aid in the implementation of this high-level programming framework, we present a flexible new intermediate representation (IR) called TAPIR that natively supports stateful data plane functionality, as well as a compiler to generate device-specific code such as P4 from TAPIR code. Additionally, we demonstrate the power of TAPIR itself by using it to build a working implementation of the CONGA congestion control system.

Citations: 0

Helix: Traffic Engineering for Multi-Controller SDN

Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR) Pub Date : 2021-10-11 DOI: 10.1145/3482898.3483354

Nicu Florin Zaicu, M. Luckie, R. Nelson, M. Barcellos

Citations: 0

Helix

Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR) Pub Date : 2021-10-11 DOI: 10.1007/springerreference_15706

Nicu Florin Zaicu, M. Luckie, Richard Nelson, M. Barcellos

Citations: 0

P4 Weaver: Supporting Modular and Incremental Programming in P4

Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR) Pub Date : 2021-10-11 DOI: 10.1145/3482898.3483353

Ali Fattaholmanan, M. Baldi, Antonio Carzaniga, R. Soulé

Citations: 1

NanoTransport

Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR) Pub Date : 2021-10-11 DOI: 10.1145/3482898.3483365

S. Arslan, Stephen Ibanez, Alex Mallery, Changhoon Kim, N. McKeown

Abstract: Transport protocols can be implemented in NIC (Network Interface Card) hardware to increase throughput, reduce latency and free up CPU cycles. If the ideal transport protocol were known, the optimal implementation would be simple: bake it into fixed-function hardware. But transport layer protocols are still evolving, with innovative new algorithms proposed every year. A recent study proposed Tonic, a Verilog-programmable transport layer in hardware. We build on this work to propose a new programmable hardware transport layer architecture, called nanoTransport, optimized for the extremely low-latency message-based RPCs (Remote Procedure Calls) that dominate large, modern distributed data center applications. NanoTransport is programmed using the P4 language, making it easy to modify existing (or create entirely new) transport protocols in hardware. We identify common events and primitive operations, allowing for a streamlined, modular, programmable pipeline, including packetization, reassembly, timeouts and packet generation, all to be expressed by the programmer. We evaluate our nanoTransport prototype by programming it to run the reliable message-based transport protocols NDP and Homa, as well as a hybrid variant. Our FPGA prototype, implemented in Chisel and running on the FireSim simulator, exposes P4-programmable pipelines and is designed to run in an ASIC at 200 Gb/s with each packet processed end-to-end in less than 10 ns (including message reassembly).
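The abstract above lists packetization and reassembly among the pipeline stages a programmer can express. As a rough software illustration of those two steps only, the sketch below splits a message into fixed-size packets and rebuilds it from out-of-order arrivals. The packet layout `(msg_id, seq, total, chunk)` and the names `packetize` and `Reassembler` are assumptions made for this example; they are not taken from nanoTransport, which is programmed in P4 and runs in hardware.

```python
def packetize(msg_id, payload, mtu):
    """Split a message into (msg_id, seq, total, chunk) packets of at most `mtu` bytes."""
    # An empty payload still yields one (empty) packet so the receiver can complete it.
    chunks = [payload[i:i + mtu] for i in range(0, len(payload), mtu)] or [b""]
    return [(msg_id, seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


class Reassembler:
    """Collect packets per message and return the payload once every packet has arrived."""

    def __init__(self):
        self.pending = {}  # msg_id -> {seq: chunk}

    def receive(self, packet):
        msg_id, seq, total, chunk = packet
        parts = self.pending.setdefault(msg_id, {})
        parts[seq] = chunk  # a duplicate packet simply overwrites the same slot
        if len(parts) < total:
            return None  # still waiting for more packets
        del self.pending[msg_id]
        return b"".join(parts[i] for i in range(total))


# Hypothetical usage: deliver the packets in reverse order.
packets = packetize(7, b"hello nanoTransport", mtu=8)
rx = Reassembler()
results = [rx.receive(p) for p in reversed(packets)]
```

Only the last call to `receive` returns the full message; earlier calls return `None` while packets are still missing. Timeouts and retransmission, which the abstract also mentions, are left out of this sketch.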

Citations: 10

Clustreams

Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR) Pub Date : 2021-10-11 DOI: 10.1145/3482898.3483356

Roy Friedman, Or Goaz, Ori Rottenstreich

Abstract: Clustering is a basic machine learning task. In this task, a stream of input items needs to be grouped into clusters, such that all items classified into the same cluster are closer to each other than to items classified to other clusters. Each cluster is centered around a centroid point, which may either be given as a parameter, or must be learned during the process in the case of unsupervised online learning. This work studies the ability to perform clustering, e.g., for classifying network traffic, in programmable switches. Conducting such classification by the switches through which the traffic flows is potentially the most efficient approach. To that end, we develop Clustreams, a novel in-network clustering system designed to handle clustering in the data path. At the core of Clustreams is a novel clustering algorithm that relies heavily on TCAM (Ternary Content Addressable Memory) match-action capabilities. This algorithm is realized for the Nvidia Spectrum-3 switch, and is limited to classification when the centroid points are known a priori. The work includes accuracy measurements for the algorithms, as well as run-time performance measurements and analysis of the clustering algorithm on a Spectrum-3 switch. As shown in the measurements, Clustreams obtains very high accuracy without any noticeable run-time impact on the switch's performance.
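The task the abstract describes, classifying each item to the closest of a set of centroids known in advance, can be sketched in plain software as follows. This is ordinary nearest-centroid classification with Euclidean distance, not the TCAM-based algorithm the paper develops; the function name, the choice of distance, and the sample points are assumptions for illustration.

```python
import math


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


# Hypothetical 2-D centroids and a short stream of items to classify.
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
stream = [(1.0, 1.0), (9.0, 8.5), (0.5, 9.0)]
labels = [nearest_centroid(p, centroids) for p in stream]
# labels == [0, 1, 2]
```

A switch cannot evaluate this distance computation per packet in the same way, which is presumably why the paper recasts the classification as match-action lookups.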

Citations: 7

How Much TCAM do we Need for Splitting Traffic?

Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR) Pub Date : 2021-10-11 DOI: 10.1145/3482898.3483367

Yaniv Sadeh, Ori Rottenstreich, Haim Kaplan

Citations: 3

Accelerating Distributed Deep Learning using Multi-Path RDMA in Data Center Networks

Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR) Pub Date : 2021-10-11 DOI: 10.1145/3482898.3483363

Feng Tian, Yang Zhang, Wei Ye, Cheng Jin, Ziyan Wu, Zhi-Li Zhang

Citations: 4

Nimble: Scalable TCP-Friendly Programmable In-Network Rate-Limiting

Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR) Pub Date : 2021-10-11 DOI: 10.1145/3482898.3483361

Vineeth Sagar Thapeta, Komal Shinde, Mojtaba MalekpourShahraki, Darius Grassi, Balajee Vamanan, Brent E. Stephens

Abstract: There is an emerging need for scalable high-performance in-network rate-limiting because rate-limiters can be used to provide performance isolation. However, existing approaches to in-network rate-limiting are not scalable or TCP-friendly. This paper presents the design of Nimble, a new approach to in-network rate-limiting that is scalable, high performance, and TCP-friendly. Nimble uses meters to scalably provide hardware rate-limiting without any dedicated queuing or buffering resources, and Nimble uses ECN-Shaping for TCP-friendly rate-limit enforcement. Nimble also introduces the first algorithm for configuring in-network rate-limiters to enforce network-wide isolation policies. Through a P4 implementation and experiments with a 100 Gbps Barefoot Tofino switch, we find that Nimble is immediately usable and can operate even with high bandwidth rate-limits without needing to recirculate packets or rely on hardware packet generators to generate token refill packets. This overcomes the scalability limitations of prior approaches. Experiments with Apache and Redis show that Nimble can reduce application-level latency by an order of magnitude when compared to not using in-network rate-limiting, and ns-3 simulations demonstrate that Nimble behaves well in larger clusters. We find that Nimble can scale to 100K rate-limiters per switch when implemented on a Barefoot Tofino switch, and our new rate allocation algorithm reduces rate-limiter updates by a factor of 10x-24x and improves network utilization by 24%.
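To make the idea of metering without queuing concrete, here is a minimal token-bucket meter that forwards every packet and sets an ECN mark on packets exceeding the configured rate, instead of buffering or dropping them. This is one plausible reading of the general approach, written under stated assumptions: the abstract does not specify how ECN-Shaping decides which packets to mark, and the `Meter` class, its parameters, and the `handle` function are hypothetical names, not Nimble's P4 implementation.

```python
from dataclasses import dataclass


@dataclass
class Meter:
    rate_bps: float     # allowed rate in bits per second (assumed parameter)
    burst_bits: float   # bucket depth in bits (assumed parameter)
    tokens: float = 0.0
    last_ts: float = 0.0

    def __post_init__(self):
        self.tokens = self.burst_bits  # start with a full bucket

    def conforms(self, size_bits, now):
        """Refill by elapsed time, then report whether the packet fits in the bucket."""
        elapsed = now - self.last_ts
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        self.last_ts = now
        if self.tokens >= size_bits:
            self.tokens -= size_bits
            return True
        return False


def handle(meter, size_bits, now):
    """Never queue or drop: out-of-profile packets are forwarded with an ECN mark."""
    return "forward" if meter.conforms(size_bits, now) else "forward+ecn_ce"


# Hypothetical usage: 8 kbit/s limit, 12 kbit bucket, three 12 kbit packets.
meter = Meter(rate_bps=8000.0, burst_bits=12000.0)
decisions = [handle(meter, 12000, t) for t in (0.0, 0.5, 1.5)]
# decisions == ["forward", "forward+ecn_ce", "forward"]
```

The second packet arrives before the bucket has refilled, so it is marked rather than delayed; a TCP sender that reacts to the mark slows down on its own, which is the intuition behind enforcing a limit without dedicated buffers.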

Citations: 7

Taproot: Resilient Diversity Routing with Bounded Latency

Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR) Pub Date : 2021-10-11 DOI: 10.1145/3482898.3483364

Eman Ramadan, Hesham Mekky, Cheng Jin, Braulio Dumba, Zhi-Li Zhang

Abstract: As we increasingly depend on networked services, ensuring resiliency of networks against network failures and providing bounded latency to applications become imperative. Adding ample redundancy in the network substrate alone is not sufficient; resilient routing mechanisms that can effectively take advantage of such topological diversity also play a critical role. In this paper, we present Taproot, a resilient diversity routing algorithm that ensures bounded latency for packet delivery under failures by leveraging a preorder routing structure with precomputed routing rules. Leveraging the centralized control plane and programmable match-action rules in the data plane, we describe how Taproot can be realized in SDN networks. We implement Taproot in OVS and conduct extensive simulations and experiments to demonstrate its superior performance over existing solutions. Our results show that by tuning the latency allowance upon failure, Taproot reduces/eliminates the number of disconnected src-dst pairs even under 10 link failures. Finally, as a use case, we illustrate the impact of control channel failures on SDN data plane/application performance, and employ Taproot to provide a "hardened" SDN control network with bounded latency against failures. Our results show that Taproot immediately detects the failure and re-routes the control messages to a different path avoiding failed links/nodes. Hence, the control channel is maintained without interruption or involvement from the controller, and the throughput is not affected.

Citations: 2