Optimized Core-Links for Low-Latency NoCs

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing. Pub Date: 2015-03-04. DOI: 10.1109/PDP.2015.15

Ryuta Kawano, S. Tade, I. Fujiwara, Hiroki Matsutani, H. Amano, M. Koibuchi


Citations: 2

Abstract

In recent many-core architectures, the number of cores has been steadily increasing, and the network latency between cores has therefore become an important issue for parallel application programs. Because packet-switched networks are widely used for core-to-core communication, the topology among cores has a major impact on network latency. It has been reported that a small-world Network-on-Chip, which adds links between randomly selected routers on a regular router topology, is effective for reducing network latency. In this study, we extend this framework by connecting each core to multiple quasi-optimally selected neighboring routers on a 2D mesh router topology. Results obtained with a flit-level discrete-event simulator show that our optimized core-link topologies achieve an average latency up to 48% lower than that of baseline topologies. Furthermore, full-system CMP simulation results show that optimized core-links improve application execution time on the NAS Parallel Benchmarks by up to 10.1%.
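As a rough illustration of why attaching a core to several routers can shorten paths, the toy model below compares average router-to-router hop counts on a small 2D mesh with one link per core and with extra links to adjacent routers. It is not the authors' quasi-optimal selection method or their flit-level simulator. The fixed offset pattern `FOUR_NEIGHBORS`, the function names `core_links` and `avg_hops`, and the metric itself (Manhattan distance between the best pair of attached routers, ignoring the core-to-router hops, routing restrictions, and contention) are all assumptions made for this sketch.

```python
from itertools import product

# Hypothetical link pattern: attach each core to the four routers adjacent
# to its home router. The paper selects routers quasi-optimally instead.
FOUR_NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def core_links(k, offsets):
    """Map each core (x, y) on a k x k mesh to the routers it attaches to.

    Every core always keeps the link to its home router (x, y); offsets
    that fall outside the mesh are dropped.
    """
    links = {}
    for x, y in product(range(k), repeat=2):
        extra = {
            (x + dx, y + dy)
            for dx, dy in offsets
            if 0 <= x + dx < k and 0 <= y + dy < k
        }
        links[(x, y)] = {(x, y)} | extra
    return links


def avg_hops(k, offsets):
    """Average minimum router-to-router hop count over all distinct core pairs."""
    links = core_links(k, offsets)
    total = 0
    pairs = 0
    for src, dst in product(links, repeat=2):
        if src == dst:
            continue
        # A packet may enter at any router attached to the source core and
        # leave at any router attached to the destination core.
        total += min(
            abs(ax - bx) + abs(ay - by)
            for ax, ay in links[src]
            for bx, by in links[dst]
        )
        pairs += 1
    return total / pairs


if __name__ == "__main__":
    k = 8
    print("single link per core:", avg_hops(k, []))
    print("with four extra links:", avg_hops(k, FOUR_NEIGHBORS))
```

Hop count is only a proxy for latency, so the ratio printed here should not be compared directly with the simulation results reported in the abstract.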

