{"title":"Designing packet buffers with statistical guarantees","authors":"G. Shrimali, I. Keslassy, N. McKeown","doi":"10.1109/CONECT.2004.1375202","DOIUrl":"https://doi.org/10.1109/CONECT.2004.1375202","url":null,"abstract":"Packet buffers are an essential part of routers. In high-end routers, these buffers need to store a large amount of data at very high speeds. To satisfy these requirements, we need a memory with the speed of SRAM and the density of DRAM. A typical solution is to use hybrid packet buffers built from a combination of SRAM and DRAM, where the SRAM holds the heads and tails of per-flow packet FIFOs and the DRAM is used for bulk storage. The main challenge then is to minimize the size of the SRAM while providing reasonable performance guarantees. We analyze a commonly used hybrid architecture from a statistical perspective, and investigate how small the SRAM can get if the packet buffer designer is willing to tolerate a certain drop probability. We introduce an analytical model to represent the SRAM buffer occupancy, and derive drop probabilities as a function of SRAM size under a wide range of statistical traffic patterns. By our analysis, we show that, for low drop probability, the required SRAM size is proportional to the number of flows.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123028987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-random generator for IPv6 tables","authors":"Mei Wang, S. Deering, T. Hain, L. Dunn","doi":"10.1109/CONECT.2004.1375198","DOIUrl":"https://doi.org/10.1109/CONECT.2004.1375198","url":null,"abstract":"The next generation Internet Protocol, IPv6, has attracted growing attention. The characteristics of future IPv6 routing tables play a key role in router architecture and network design. In order to design and analyze efficient and scalable IP lookup algorithms for IPv6, IPv6 routing tables are needed. Analysis of existing IPv4 tables shows that there is an underlying structure that differs greatly from random distributions. Since there are few users on IPv6 at present, current IPv6 table sizes are small and unlikely to reflect future IPv6 network growth. Thus, neither randomly generated tables nor current IPv6 tables are good benchmarks for analysis. More representative IPv6 lookup tables are needed for the development of IPv6 routers. From an analysis of current IPv4 tables, algorithms are proposed for generating IPv6 lookup tables. Tables generated by the suggested methods exhibit certain features characteristic of real lookup tables, reflecting not only new IPv6 address allocation schemes but also patterns common to IPv4 tables. These tables provide useful research tools by a better representation of future lookup tables as IPv6 becomes more widely deployed.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":" 29","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120833363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient multi-match packet classification with TCAM","authors":"Fang Yu, R. Katz","doi":"10.1109/CONECT.2004.1375197","DOIUrl":"https://doi.org/10.1109/CONECT.2004.1375197","url":null,"abstract":"Today's packet classification systems are designed to provide the highest priority matching result, e.g., the longest prefix match, even if a packet matches multiple classification rules. However, new network applications, such as intrusion detection systems, require information about all the matching results. We call this the multi-match classification problem. In several complex network applications, multi-match classification is immediately followed by other processing dependent on the classification results. Therefore, classification should be even faster than the line rate. Pure software solutions cannot be used due to their slow speeds. We present a solution based on ternary content addressable memory (TCAM), which produces multi-match classification results with only one TCAM lookup and one SRAM lookup per packet - about ten times fewer memory lookups than a pure software approach. In addition, we present a scheme to remove the negation format in rule sets, which can save up to 95% of TCAM space compared with the straightforward solution. We show that using our pre-processing scheme, header processing for the SNORT rule set can be done with one TCAM and one SRAM lookup using a 135 KB TCAM.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120991422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient multicast on a terabit router","authors":"Punit Bhargava, Sriram C. Krishnan, R. Panigrahy","doi":"10.1109/CONECT.2004.1375203","DOIUrl":"https://doi.org/10.1109/CONECT.2004.1375203","url":null,"abstract":"Multicast routing protocols and routers on the Internet enable multicast transmission by replicating packets close to the destinations, obviating the need for multiple unicast connections, thereby saving network bandwidth and improving throughput. Similarly, within a router, multicast between linecards is enabled by a multicast capable switch fabric. A multicast cell is sent once from the source linecard to the switch fabric; the switch fabric sends the cells to all the destination linecards obviating the need for, and the waste of, linecard to fabric bandwidth that would result from multiple unicast cell transmissions. For high capacity routers (several terabits), the fixed size destination field of the cell is inadequate to specify exactly the subset of the switch ports the multicast cell should be sent to for the number of multicast connections to be supported. Therefore, for several connections, we have to supercast, i.e., send the cell to non-subscribing linecards and have them drop the cell. We study the problem of assigning destination labels for multicast cells so that the amount of supercast, i.e., wasted bandwidth, is minimized, and the throughput of the router is maximized. We formalize this combinatorial optimization problem and prove it NP-complete and hard to find approximate solutions. We have devised several heuristic algorithms that we have implemented and we report the experimental results. Faster heuristics can support a higher multicast connection establishment rate; slower heuristics can be invoked off-line to further optimize multicast label maps.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122078053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance evaluation of InfiniBand with PCI Express","authors":"Jiuxing Liu, A. Mamidala, Abhinav Vishnu, D. Panda","doi":"10.1109/CONECT.2004.1375193","DOIUrl":"https://doi.org/10.1109/CONECT.2004.1375193","url":null,"abstract":"We present an initial performance evaluation of InfiniBand HCAs (host channel adapters) from Mellanox with PCI Express interfaces. We compare the performance with HCAs using PCI-X interfaces. Our results show that InfiniBand HCAs with PCI Express can achieve significant performance benefits. Compared with HCAs using 64 bit/133 MHz PCI-X interfaces, they can achieve 20%-30% lower latency for small messages. The small message latency achieved with PCI Express is around 3.8 μs, compared with the 5.0 μs with PCI-X. For large messages, HCAs with PCI Express using a single port can deliver unidirectional bandwidth up to 968 MB/s and bidirectional bandwidth up to 1916 MB/s, which are, respectively, 1.24 and 2.02 times the peak bandwidths achieved by HCAs with PCI-X. When both the ports of the HCAs are activated, HCAs with PCI Express can deliver a peak unidirectional bandwidth of 1486 MB/s and aggregate bidirectional bandwidth up to 2729 MB/s, which are 1.93 and 2.88 times the peak bandwidths obtained using HCAs with PCI-X. PCI Express also improves performance at the MPI level. A latency of 4.6 μs with PCI Express is achieved for small messages. And for large messages, unidirectional bandwidth of 1497 MB/s and bidirectional bandwidth of 2724 MB/s are observed.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128371906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Resilient network infrastructures for global grid computing","authors":"L. Valcarenghi","doi":"10.1109/CONECT.2004.1375212","DOIUrl":"https://doi.org/10.1109/CONECT.2004.1375212","url":null,"abstract":"Summary form only given. Grid computing is defined as \"coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations\". The transport network infrastructure represents one of the main resources to be shared. Emerging high capacity intelligent grid transport network infrastructures, such as optical transport networks based on generalized multiprotocol label switching (GMPLS) and automatically switched optical networks/automatically switched transport networks (ASON/ASTN), are fostering the expansion of grid computing from local area networks (LAN) (i.e., cluster grid) to wide area networks (WAN) (i.e., global grid). Indeed they are able to guarantee the required quality of service (QoS) to heterogeneous grid applications that share the same grid network infrastructure. The tutorial addresses one particular aspect of the grid transport network QoS: resilience, i.e. the ability to overcome failures. In particular, it gives an overview of the current efforts for guaranteeing grid application resilience in spite of different types of failures, such as network infrastructure failures or computer crashes. Finally, it shows that, by tailoring the utilized recovery scheme to the type of failure that occurred, it is possible to optimize the failure recovery process.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":"283 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121368557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of a high-speed optical interconnect for scalable shared memory multiprocessors","authors":"Avinash Karanth Kodi, A. Louri","doi":"10.1109/MM.2005.7","DOIUrl":"https://doi.org/10.1109/MM.2005.7","url":null,"abstract":"The paper proposes a highly connected optical interconnect based architecture that maximizes the channel availability for future scalable parallel computers, such as distributed shared memory (DSM) multiprocessors and cluster networks. As the system size increases, various messages (requests, responses and acknowledgments) increase in the network resulting in contention. This results in increasing the remote memory access latency and significantly affects the performance of these parallel computers. As a solution, we propose an architecture called RAPID (reconfigurable and scalable all-photonic interconnect for distributed-shared memory), that provides low remote memory access latency by providing fast and efficient unicast, multicast and broadcast capabilities using a combination of aggressively designed WDM, TDM and SDM techniques. We evaluated RAPID based on network characteristics and by simulation using synthetic traffic workloads and compared it against other networks such as electrical ring, torus, mesh and hypercube networks. We found that RAPID outperforms all networks and satisfies most of the requirements of parallel computer design such as low latency, high bandwidth, high connectivity, and easy scalability.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116096010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Studying network protocol offload with emulation: approach and preliminary results","authors":"R. Westrelin, Nicolas Fugier, Erik Nordmark, Kai Kunze, E. Lemoine","doi":"10.1109/CONECT.2004.1375208","DOIUrl":"https://doi.org/10.1109/CONECT.2004.1375208","url":null,"abstract":"To take full advantage of high-speed networks while freeing CPU cycles for application processing, the industry is proposing new techniques relying on an extended role for network interface cards such as TCP offload engine and remote direct memory access. The paper presents an experimental study aimed at collecting the performance data needed to assess these techniques. This work is based on the emulation of an advanced network interface card plugged on the I/O bus. In the experimental setting, a processor of a partitioned SMP machine is dedicated to network processing. Achieving a faithful emulation of a network interface card is one of the main concerns and it is guiding the design of the offload engine software. This setting has the advantage of being flexible so that many different offload scenarios can be evaluated. Preliminary throughput results of an emulated TCP offload engine demonstrate a large benefit. The emulated TCP offload engine indeed yields 600% to 900% improvement while still relying on memory copies at the kernel boundary.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129503724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Worms vs. perimeters: the case for hard-LANs","authors":"N. Weaver, Daniel P. W. Ellis, Stuart Staniford-Chen, V. Paxson","doi":"10.1109/CONECT.2004.1375206","DOIUrl":"https://doi.org/10.1109/CONECT.2004.1375206","url":null,"abstract":"Network worms - self-propagating network programs - represent a substantial threat to our network infrastructure. Due to the propagation speed of worms, reactive defenses need to be automatic. It is important to understand where and how these defenses need to fit in the network so that they cannot be easily evaded. As there are several mechanisms malcode authors can use to bypass existing perimeter-centric defenses, this position paper argues that substantial defenses need to be embedded in the local area network, thus creating \"hard-LANs\" designed to detect and respond to worm infections. When compared with conventional network intrusion detection systems (NIDSs), we believe that hard-LAN devices need to have two orders of magnitude better cost/performance, and at least two orders of magnitude better accuracy, resulting in substantial design challenges.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122413250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of a wireless enterprise backbone network architecture","authors":"Ashish Raniwala, T. Chiueh","doi":"10.1109/CONECT.2004.1375211","DOIUrl":"https://doi.org/10.1109/CONECT.2004.1375211","url":null,"abstract":"IEEE 802.11 wireless LAN technology is mainly used as an access network within corporate enterprises. All the WLAN access points are eventually connected to a wired backbone to reach the Internet or enterprise computing resources. We aim to expand WLAN into an enterprise-scale backbone network technology by developing a multichannel wireless mesh network architecture called Hyacinth. Hyacinth equips each node with multiple IEEE 802.11a/b NICs and supports distributed channel assignment/routing to increase the overall network throughput. We present the results of a detailed performance evaluation study on the multichannel mesh networking aspect of Hyacinth, based on both NS-2 simulations and empirical measurements collected from a 9-node Hyacinth prototype testbed. A key result of this study is that equipping each node of a Hyacinth network with just 3 NICs can increase the total network bandwidth by a factor of 6 to 7, as compared with single-channel wireless mesh network architecture.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122753368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}