{"title":"Early Experiences with Saving Energy in Direct Interconnection Networks","authors":"F. Zahn, S. Lammel, H. Fröning","doi":"10.1109/HiPINEB.2017.10","DOIUrl":"https://doi.org/10.1109/HiPINEB.2017.10","url":null,"abstract":"Energy is emerging to become one of the most crucial factors in design decisions for future large scale computing systems. Especially Exascale-installations will have to operate within hard power and energy constraints. Besides economical reasons, power consumption is also limited by a limited power distribution, cooling capabilities, and minimization of carbon footprints. While other components, such as processors, become more and more energy-proportional, interconnects are still highly energy-disproportional. Although interconnection networks are contributing only about 10-20% to the overall power consumption of High-Performance Computing (HPC) or Cloud systems, this fraction is likely to increase significantly in the near future. Therefore, power saving strategies are mandatory for improving energy efficiency and thereby performance within hard power constraints. In this work, we introduce a simple energy saving strategy, which switches links on and off, depending on the user's performance constraints. Therefore, we adapted an existing OMNeT++ network simulator by adding new energy features. This simulator allows us to run traces of real world applications, including LULESH, NAMD, and Graph500 with different configurations. We show that this policy enables possible energy savings of up to 39% in interconnection networks. Furthermore, we demonstrate the impact of hardware design parameters, such as transition time, on possible power saving strategies.","PeriodicalId":426494,"journal":{"name":"2017 IEEE 3rd International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130555185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dragonfly+: Low Cost Topology for Scaling Datacenters","authors":"Alexander Shpiner, Zachy Haramaty, Saar Eliad, Vladimir Zdornov, B. Gafni, E. Zahavi","doi":"10.1109/HiPINEB.2017.11","DOIUrl":"https://doi.org/10.1109/HiPINEB.2017.11","url":null,"abstract":"Dragonfly topology was introduced by Kim et al. [1] aiming to decrease the cost and diameter of the network. The topology divides routers into groups connected by long links. Each group strives to implement high-radix virtual router, connected by a completely-connected topology. In this paper, we propose an extended Dragonfly+ network in which routers inside the group are connected in Clos-like topology. Dragonfly+ is superior to conventional Dragonfly due to the significantly larger number of hosts which it is able to support. In addition, Dragonfly+ supports similar or better bi-sectional bandwidth for various traffic patterns, and requires smaller number of buffers to avoid credit loop deadlocks in lossless networks. Moreover, we introduce a novel Fully Progressive Adaptive Routing algorithm with remote congestion notifications. To support our proposal we present analytical analysis and simulations.","PeriodicalId":426494,"journal":{"name":"2017 IEEE 3rd International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB)","volume":"29 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120856972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knapp: A Packet Processing Framework for Manycore Accelerators","authors":"Junhyun Shim, Joongi Kim, Keunhong Lee, S. Moon","doi":"10.1109/HiPINEB.2017.8","DOIUrl":"https://doi.org/10.1109/HiPINEB.2017.8","url":null,"abstract":"High-performance network packet processing benefits greatly from parallel-programming accelerators such as Graphics Processing Units (GPUs). Intel Xeon Phi, a relative newcomer in this market, is a distinguishing platform because its x86-compatible vectorized architecture offers additional optimization opportunities. Its software stack exposes low-level communication primitives, enabling fine-grained control and optimization of offloading processes. Nonetheless, our microbenchmarks show that offloading APIs for Xeon Phi comes in short for combining low latency and high throughput for both I/O and computation. In this work, we exploit Xeon Phi's low-level threading mechanisms to design a new offloading framework, Knapp, and evaluate it using simplified IP routing applications. Knapp lays the ground for full exploitation of Xeon Phi as a packet processing framework.","PeriodicalId":426494,"journal":{"name":"2017 IEEE 3rd International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134466299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New Link Arrangements for Dragonfly Networks","authors":"M. Belka, Myra Doubet, S. Meyers, Rose L. Momoh, David Rincon-Cruz, David P. Bunde","doi":"10.1109/HiPINEB.2017.14","DOIUrl":"https://doi.org/10.1109/HiPINEB.2017.14","url":null,"abstract":"Dragonfly networks have been proposed to exploit high-radix routers and optical links for high performance computing (HPC) systems. Such networks divide the switches into groups, with a local link between each pair of switches in a group and a global link between each group. Which specific switch serves as the endpoint of each global link is determined by the network's global link arrangement. We propose two new global link arrangements, each designed using intuition of how to optimize bisection bandwidth when global links have high bandwidth relative to local links. Despite this, the new arrangements generally outperform previously-known arrangements for all bandwidth relationships.","PeriodicalId":426494,"journal":{"name":"2017 IEEE 3rd International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB)","volume":"344 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133806944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extending Commodity OpenFlow Switches for Large-Scale HPC Deployments","authors":"M. Benito, E. Vallejo, R. Beivide, C. Izu","doi":"10.1109/HiPINEB.2017.12","DOIUrl":"https://doi.org/10.1109/HiPINEB.2017.12","url":null,"abstract":"Commodity Ethernet networks are used in many HPC systems. Extensions based on OpenFlow have been proposed for large HPC deployments, considering scalability and power consumption concerns. Such designs employ low-diameter topologies to minimize power consumption, such as Flattened Butterflies or Dragonflies. However, these topologies require non-minimal adaptive routing to deal with varying traffic characteristics and avoid pathological behaviors. The solutions to this issue in previous work relies on Ethernet Pauses to adapt minimal or non-minimal routing, depending on the availability (Pause status) of each corresponding output port. Nevertheless, such design provides an undesired high average latency under adversarial traffic patterns and a reduction in peak throughput under uniform traffic. This paper identifies the causes of the issues presented above, and presents a preliminary study of alternative solutions based on exploiting commodity congestion notification messages (QCN, 802.1Qau), currently available in Datacenter switches. This work presents the main differences between a congestion control mechanism such as QCN, which performs injection throttling reducing average network load, and an adaptive routing mechanism, which diverts traffic away from the congested area but increases average network load. In particular, it identifies the difficulty of separating the cases of uniform traffic at saturation and adversarial traffic at low loads.","PeriodicalId":426494,"journal":{"name":"2017 IEEE 3rd International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121269991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Isolating Jobs for Security on High-Performance Fabrics","authors":"Matthieu Perotin, Tom Cornebize","doi":"10.1109/HiPINEB.2017.13","DOIUrl":"https://doi.org/10.1109/HiPINEB.2017.13","url":null,"abstract":"The various pieces of equipment in super-computers are shared between jobs, that belong to different users. This situation raises security concerns. Jobs must not be able to conduct denial of service attacks targeting other jobs (voluntarily or accidentally). Moreover, job isolation must be guaranteed: unauthorized communication between two different jobs should not be allowed. However, high-performance interconnects are designed with performance as their main objective, and bypass the OS and its security models. In this paper, we show that by acting at the routing table level, it is possible to enforce job isolation without impacting job performance. Moreover, the isolation process can be dynamic, quick to set-up, with algorithms that are both independent from the routing algorithms and the interconnect topology.","PeriodicalId":426494,"journal":{"name":"2017 IEEE 3rd International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128346901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Case Study on Implementing Virtual 5D Torus Networks Using Network Components of Lower Dimensionality","authors":"Francisco J. Andújar, Juan A. Villar, J. L. Sánchez, F. J. Alfaro, J. Duato, H. Fröning","doi":"10.1109/HiPINEB.2017.7","DOIUrl":"https://doi.org/10.1109/HiPINEB.2017.7","url":null,"abstract":"Several of the most powerful supercomputers in the Top500 and the Graph500 lists continue choosing a torus topology to interconnect a large number of compute nodes. In some cases, a torus network with five or six dimensions is implemented, however, one notices that the costs of implementing an interconnection network increase with the node degree. In previous works we defined and characterized the nD Twin (nDT) torus topology in order to virtually increase the dimensionality of a torus. This new topology reduces the distances between nodes and therefore increases network performance. In this work, we present how to build a 5DT torus network using commercial 6-port network cards. The main issues of this approach are detailed, and we present solutions to these problems. Moreover we show, using the same components, that the performance of the 5DT torus network is higher than the performance of the 3D torus network for the same number of compute nodes.","PeriodicalId":426494,"journal":{"name":"2017 IEEE 3rd International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB)","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123715018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}