Approaching the theoretical limits of a mesh NoC with a 16-node chip prototype in 45nm SOI
Sunghyun Park, T. Krishna, C. Chen, Bhavya K. Daya, A. Chandrakasan, L. Peh
DAC Design Automation Conference 2012, published 2012-06-03
DOI: 10.1145/2228360.2228431 (https://doi.org/10.1145/2228360.2228431)
Citations: 107
Abstract
In this paper, we present a case study of our chip prototype of a 16-node 4×4 mesh NoC fabricated in 45nm SOI CMOS that aims to simultaneously optimize energy, latency and throughput for unicasts, multicasts and broadcasts. We first define and analyze the theoretical limits of a mesh NoC in latency, throughput and energy, then describe how we approach these limits through a combination of microarchitecture and circuit techniques. Our 1.1V 1GHz NoC chip achieves 1-cycle router-and-link latency at each hop and energy-efficient router-level multicast support, delivering 892Gb/s (87.1% of the theoretical bandwidth limit) at 531.4mW for mixed unicast and broadcast traffic. Through this fabrication, we derive insights that help guide our research and that, we believe, will also be useful to the NoC and multicore research community.
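The headline figures in the abstract can be related with a little arithmetic. The sketch below is not from the paper: it only divides the quoted numbers to get the bandwidth limit they imply and an average energy per delivered bit. The variable names are illustrative, and the abstract does not say how bits are counted for broadcasts, so the energy figure should be read as a rough average at that operating point.

```python
# Figures quoted in the abstract.
delivered_gbps = 892.0        # delivered throughput, Gb/s
fraction_of_limit = 0.871     # 87.1% of the theoretical bandwidth limit
power_mw = 531.4              # power at that operating point, mW

# Implied theoretical bandwidth limit (derived here, not stated in the abstract).
implied_limit_gbps = delivered_gbps / fraction_of_limit

# Average energy per delivered bit: (mW -> W) / (Gb/s -> b/s), then J -> pJ.
energy_per_bit_pj = (power_mw * 1e-3) / (delivered_gbps * 1e9) * 1e12

print(f"implied limit: {implied_limit_gbps:.0f} Gb/s")
print(f"energy per delivered bit: {energy_per_bit_pj:.3f} pJ/bit")
```

Under these assumptions the implied limit comes out near 1024 Gb/s and the average energy near 0.6 pJ per delivered bit.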