{"title":"Experiences in Co-designing a Packet Classification Algorithm and a Flexible Hardware Platform","authors":"Nilay Vaish, Thawan Kooburat, Lorenzo De Carli, K. Sankaralingam, Cristian Estan","doi":"10.1109/ANCS.2011.35","DOIUrl":"https://doi.org/10.1109/ANCS.2011.35","url":null,"abstract":"Algorithmic solutions to the packet classification problem in network equipment have long been a subject of study in academia and industry, and with increases in network speeds they are becoming even more important. Since general purpose processors cannot meet performance and cost requirements, researchers have been assuming that ASICs or FPGAs are necessary for hardware implementation. Industry and academia have been working on SRAM-based platforms specialized for tables used in network equipment, but existing publications only describe the mapping of simpler exact match or prefix match lookups to such platforms. In this paper we adopt a software-hardware co-design approach, mapping the EffiCuts algorithm to the PLUG platform. Our work confirms that this solution achieves high throughput (142 million packets per second) and low power (3.1 Watts). It identifies and evaluates changes to the original algorithm and to the platform that can improve throughput and memory utilization.","PeriodicalId":124429,"journal":{"name":"2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114410374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"400 Gb/s Programmable Packet Parsing on a Single FPGA","authors":"Michael Attig, G. Brebner","doi":"10.1109/ANCS.2011.12","DOIUrl":"https://doi.org/10.1109/ANCS.2011.12","url":null,"abstract":"Packet parsing is necessary at all points in the modern networking infrastructure, to support packet classification and security functions, as well as for protocol implementation. Increasingly high line rates call for advanced hardware packet processing solutions, while increasing rates of change call for high-level programmability of these solutions. This paper presents an approach for harnessing modern Field Programmable Gate Array (FPGA) devices, which are a natural technology for implementing the necessary high-speed programmable packet processing. The paper introduces PP: a simple high-level language for describing packet parsing algorithms in an implementation-independent manner. It demonstrates that this language can be compiled to give high-speed FPGA-based packet parsers that can be integrated alongside other packet processing components to build network nodes. Compilation involves generating virtual processing architectures tailored to specific packet parsing requirements. Scalability of these architectures allows parsing at line rates from 1 to 400 Gb/s as required in different network contexts. Run-time programmability of these architectures allows dynamic updating of parsing algorithms during operation in the field. 
Implementation results show that programmable packet parsing of 600 million small packets per second can be supported on a single Xilinx Virtex-7 FPGA device handling a 400 Gb/s line rate.","PeriodicalId":124429,"journal":{"name":"2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128969181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Scalability Study of Enterprise Network Architectures","authors":"Brent E. Stephens, A. Cox, S. Rixner, T. Ng","doi":"10.1109/ANCS.2011.28","DOIUrl":"https://doi.org/10.1109/ANCS.2011.28","url":null,"abstract":"The largest enterprise networks already contain hundreds of thousands of hosts. Enterprise networks are composed of Ethernet subnets interconnected by IP routers. These routers require expensive configuration and maintenance. If the Ethernet subnets are made more scalable, the high cost of the IP routers can be eliminated. Unfortunately, it has been widely acknowledged that Ethernet does not scale well because it relies on broadcast, which wastes bandwidth, and a cycle-free topology, which poorly distributes load and forwarding state. There are many recent proposals to replace Ethernet, each with its own set of architectural mechanisms. These mechanisms include eliminating broadcasts, using source routing, and restricting routing paths. Although there are many different proposed designs, there is little data available that allows for comparisons between designs. This study performs simulations to evaluate all of the factors that affect the scalability of Ethernet together, which has not been done in any of the proposals. The simulations demonstrate that, in a realistic environment, source routing reduces the maximum state requirements of the network by over an order of magnitude. About the same level of traffic engineering achieved by load-balancing all the flows at the TCP/UDP flow granularity is possible by routing only the heavy flows at the TCP/UDP granularity. 
Additionally, requiring routing restrictions, such as deadlock-freedom or minimum-hop routing, can significantly reduce the network's ability to perform traffic engineering across the links.","PeriodicalId":124429,"journal":{"name":"2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems","volume":"171 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133391256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Measurement of Name-Centric Content Distribution Methods","authors":"Haowei Yuan, P. Crowley","doi":"10.1109/ANCS.2011.43","DOIUrl":"https://doi.org/10.1109/ANCS.2011.43","url":null,"abstract":"The recently proposed Named Data Networking (NDN) architecture and the widely deployed HTTP infrastructure both support content distribution in a name-centric fashion. In this paper, we evaluated the content distribution performance of NDN-based and HTTP-based content distribution solutions.","PeriodicalId":124429,"journal":{"name":"2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134618247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ReClick - A Modular Dataplane Design Framework for FPGA-Based Network Virtualization","authors":"D. Unnikrishnan, Shiting Lu, Lixin Gao, R. Tessier","doi":"10.1109/ANCS.2011.31","DOIUrl":"https://doi.org/10.1109/ANCS.2011.31","url":null,"abstract":"Network virtualization has emerged as a powerful technique to deploy novel services and experimental protocols over shared network infrastructures. Although recent research has highlighted field programmable gate arrays (FPGAs) as attractive platforms for high performance network virtualization, these devices remain inaccessible to the larger networking research community due to the absence of user-friendly programming models. A programming model that can abstract the intricacies of the hardware platform while being aware of the underlying resource constraints is highly desirable. In this paper, we present ReClick, a framework to efficiently design and deploy reconfigurable data planes for FPGA-based network virtualization systems. A hardware-agnostic programming model is described that allows developers to focus on the virtual data plane semantics rather than the implementation details. The framework exposes interfaces similar to the popular software router development framework, Click, and promotes design reuse. Optimization strategies are included in ReClick which use similarities between virtual data plane configurations to implement multiple planes in an area-efficient manner. 
Data planes with data rates of up to 1 Gbps have been automatically compiled and tested in hardware on a NetFPGA platform.","PeriodicalId":124429,"journal":{"name":"2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125853340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Filtering Predicates Composition with Finite State Automata","authors":"Marco Leogrande, L. Ciminiera, Fulvio Risso","doi":"10.1109/ANCS.2011.24","DOIUrl":"https://doi.org/10.1109/ANCS.2011.24","url":null,"abstract":"Network virtualization has gained a lot of attention recently because of some interesting new proposals in the field (e.g., OpenFlow). This trend has pushed some filtering operations up to the software level, e.g., extracting a potentially large number of protocol fields from a packet, or dynamically combining different filters. The time constraints of working at line rate force the creation of a packet filter model that can guarantee the minimum number of packet checks. This poster proposes mpFSA, a packet filter model based on the Finite State Automata formalism, that aims at achieving optimality w.r.t. the number of packet accesses, without sacrificing efficiency and scalability.","PeriodicalId":124429,"journal":{"name":"2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122310321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"File-Aware P2P Traffic Classification and Management","authors":"Zhou Zhou, Tian Song","doi":"10.1109/ANCS.2011.39","DOIUrl":"https://doi.org/10.1109/ANCS.2011.39","url":null,"abstract":"As P2P dominates Internet traffic in recent years, ISPs are striving to balance between providing the basic networking service for P2P users and properly managing network bandwidth usage. However, current P2P traffic management strategies are unable to satisfy both requirements. A file-aware P2P traffic classification method is presented in this paper. It can identify a file and the associated flows. Based on the file-level concurrent flow information, ISPs can adopt more efficient and flexible strategies to manage P2P traffic. We offered two alternatives: limiting the aggregated bandwidth consumption or the number of concurrent flows that a peer can use to download a particular file.","PeriodicalId":124429,"journal":{"name":"2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122418534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"COPSS: An Efficient Content Oriented Publish/Subscribe System","authors":"Jiachen Chen, M. Arumaithurai, Lei Jiao, Xiaoming Fu, K. Ramakrishnan","doi":"10.1109/ANCS.2011.27","DOIUrl":"https://doi.org/10.1109/ANCS.2011.27","url":null,"abstract":"Content-Centric Networks (CCN) provide substantial flexibility for users to obtain information without regard to the source of the information or its current location. Publish/subscribe (pub/sub) systems have gained popularity because they remove the temporal dependency of the user having to indicate an interest each time he or she wants to receive a particular piece of related information. Currently, on the Internet, such pub/sub systems have been built on top of an IP-based network, with the additional responsibility placed on the end-systems and servers to do the work of getting a piece of information to interested recipients. We propose the Content-Oriented Pub/Sub System (COPSS) to achieve an efficient pub/sub capability for CCN. COPSS enhances the inherently pull-based CCN architectures proposed so far by integrating a push-based multicast capability at the content-centric layer. We emulate an application that is particularly emblematic of a pub/sub environment -- Twitter -- but one where subscribers are interested in content (e.g., identified by keywords), rather than tweets from a particular individual. Using trace-driven simulation, we demonstrate that our architecture can achieve a scalable and efficient content-centric pub/sub network. The simulator is parameterized using the results of careful microbenchmarking of the open-source CCN implementation and of standard IP-based forwarding. 
Our evaluations show that COPSS provides considerable performance improvements in terms of aggregate network load, publisher load and subscriber experience compared to that of a traditional IP infrastructure.","PeriodicalId":124429,"journal":{"name":"2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114220295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HPC-Mesh: A Homogeneous Parallel Concentrated Mesh for Fault-Tolerance and Energy Savings","authors":"Jesús Camacho Villanueva, J. Flich","doi":"10.1109/ANCS.2011.17","DOIUrl":"https://doi.org/10.1109/ANCS.2011.17","url":null,"abstract":"We present the Homogeneous-Parallel-Concentrated-Mesh topology (HPC-Mesh). This NoC topology provides four disjoint homogeneous concentrated mesh networks. The network interface at each core provides connectivity to all these networks by using a novel injection algorithm. The topology is dynamically adjusted to the working conditions of the network, minimizing power consumption by using only part of the network for low traffic rates and maximizing performance for high traffic rates by using all the networks. The HPC-Mesh is therefore able to adjust itself to the traffic demand through an intelligent injection algorithm. We compare it against other topologies (always using power and clock gating) with both synthetic traffic and real applications within a complete simulated system. Compared to the 2D-Mesh, on average, we reduce both the execution time by 14% and the energy consumption by 22% in real applications when using 16 cores and up to 24% in execution time and 11% in energy consumption when using 32 cores. In addition, the new topology provides a superior degree of fault tolerance: it remains operational even when up to 3 of its sub-networks fail. 
The extension of the topology to 3D stacked chips is also provided, exhibiting a low and practical resource overhead.","PeriodicalId":124429,"journal":{"name":"2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134354161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Canonical Multicore Architecture for Network Routers","authors":"Sabina Grover, A. Dhanotia, G. Byrd","doi":"10.1109/ANCS.2011.30","DOIUrl":"https://doi.org/10.1109/ANCS.2011.30","url":null,"abstract":"There has been a significant increase in Internet dynamics in the past decade. This has put tremendous pressure on the performance of routing protocols as they need to keep updating their routing information with every network change across the globe. With the growth of the Internet, Border Gateway Protocol (BGP) has become a critical routing application. Good performance of BGP on network processors directly translates to better convergence time for route changes on the Internet, leading to reduced data loss on the network. BGP is the ubiquitous routing protocol on the Internet core, and hence analyzing its performance and exploring avenues for speeding it up can greatly help in improving the responsiveness and reliability of the Internet. In this paper, we investigate the use of multicore as the compute platform for routing protocols using BGP as a representative application. We discuss two different schemes for parallelizing BGP and analyze the performance of both serial and parallel BGP implementations on a fully configurable multicore simulation environment. Subsequently, we analyze the architectural bottlenecks in conventional multicore systems that limit the speedup that can be achieved by software parallelism alone, and propose a canonical multicore architecture for routing protocols, which can be used for future routing processor designs. 
The analysis and proposed schemes in this paper would greatly help in understanding the behavior of BGP, thereby assisting in design and development of next generation network processors.","PeriodicalId":124429,"journal":{"name":"2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems","volume":"8 16","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120935891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}