Predictive and Distributed Routing Balancing for High Speed Interconnection Networks
Carlos Nunez Castillo, D. Lugones, Daniel Franco, E. Luque
2011 IEEE International Conference on Cluster Computing, 2011-09-26
DOI: 10.1109/CLUSTER.2011.66 (https://doi.org/10.1109/CLUSTER.2011.66)
Citations: 3
Abstract
Parallel applications require an interconnection network that provides low and bounded communication delays. Communication characteristics such as traffic pattern and communication load change over time and may eventually exceed the available network capacity, causing congestion and performance degradation. Congestion control based on adaptive routing should therefore be applied in order to adapt quickly to changing traffic conditions. Studies show that a wide range of parallel applications exhibit repetitive behavior and can be characterized by a set of representative phases. This work presents a Predictive and Distributed Routing Balancing technique (PR-DRB) that controls network congestion through adaptive traffic distribution. PR-DRB uses speculative routing based on application repetitiveness: it monitors message latencies on routers and logs the solutions it finds to congestion, so that it can respond quickly when similar situations recur. Experimental results show that the predictive approach could be used to improve performance.
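The idea of logging solutions to congestion and reusing them can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the PR-DRB algorithm itself: the abstract gives no thresholds, data structures, or path-selection rules, so `PredictiveBalancer`, `LATENCY_THRESHOLD`, the traffic `signature` key, and the "open one more path per congested sample" rule are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical latency threshold; the abstract does not specify one.
LATENCY_THRESHOLD = 100.0


@dataclass
class PredictiveBalancer:
    """Toy model: widen the path set under congestion, log what worked, reuse it."""
    candidate_paths: list                 # alternative paths, in the order they are opened
    active: int = 1                       # number of paths currently in use
    solutions: dict = field(default_factory=dict)  # signature -> path count that cleared it
    _searching: object = None             # signature currently being solved, if any

    def observe(self, signature, latency):
        """Process one latency sample and return the paths to use next."""
        if latency > LATENCY_THRESHOLD:
            if self._searching is None and signature in self.solutions:
                # Pattern seen before: jump straight to the logged solution.
                self.active = self.solutions[signature]
            elif self.active < len(self.candidate_paths):
                # Unknown (or still congested): open one more path.
                self.active += 1
            self._searching = signature
        elif self._searching is not None:
            # Congestion cleared: log how many paths it took.
            self.solutions[self._searching] = self.active
            self._searching = None
        return self.candidate_paths[:self.active]

    def reset(self):
        """Fall back to a single path, e.g. when a communication phase ends."""
        self.active = 1
        self._searching = None


balancer = PredictiveBalancer(["p0", "p1", "p2", "p3"])

# First occurrence of phase "A": paths are opened one sample at a time.
balancer.observe("A", 150.0)   # congested, 2 paths
balancer.observe("A", 130.0)   # still congested, 3 paths
balancer.observe("A", 80.0)    # cleared, solution for "A" is logged
balancer.reset()

# Second occurrence: the logged solution is applied on the first congested sample.
paths = balancer.observe("A", 150.0)
print(paths)
```

In this model the first occurrence of a traffic pattern needs several latency samples before enough paths are open, while a repeated occurrence reaches the same configuration in one step, which is the kind of benefit the abstract attributes to exploiting application repetitiveness.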