2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip: Latest Papers

Fine-Grained Bandwidth Adaptivity in Networks-on-Chip Using Bidirectional Channels

2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip Pub Date : 2012-05-09 DOI: 10.1109/NOCS.2012.23

R. Hesse, J. Nicholls, Natalie D. Enright Jerger

Abstract: Networks-on-Chip (NoCs) serve as efficient and scalable communication substrates for many-core architectures. Currently, the bandwidth provided in NoCs is overprovisioned for their typical use case: in real-world multi-core applications, less than 5% of channels are utilized on average. Large bandwidth resources serve to keep network latency low during periods of peak communication demand. Increasing the average channel utilization through narrower channels could improve the efficiency of NoCs in terms of area and power; however, in current NoC architectures this degrades overall system performance. Based on a thorough analysis of the dynamic behaviour of real workloads, we design a novel NoC architecture that adapts to changing application demands. Our architecture uses fine-grained bandwidth-adaptive bidirectional channels to improve channel utilization without negatively affecting network latency. Running PARSEC benchmarks on a cycle-accurate full-system simulator, we show that fine-grained bandwidth adaptivity can save up to 75% of channel resources while achieving 92% of overall system performance compared to the baseline network. No performance is sacrificed in our network design configured with 50% of the channel resources used in the baseline.

Pages: 132-141
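
To make the idea of a bandwidth-adaptive bidirectional channel concrete, the following is a minimal sketch of how a fixed pool of lanes between two neighbouring routers might be split by demand. The function name `allocate_lanes`, the proportional policy, and the one-lane minimum are illustrative assumptions; the abstract does not describe the paper's actual arbitration logic.

```python
def allocate_lanes(total_lanes, demand_ab, demand_ba):
    """Split a pool of bidirectional lanes between directions A->B and B->A.

    Hypothetical policy (not taken from the paper): lanes are shared in
    proportion to pending demand, and any direction with pending flits
    keeps at least one lane so it cannot be starved.
    Returns (lanes_ab, lanes_ba).
    """
    if total_lanes < 2:
        raise ValueError("need at least two lanes to share between directions")
    total_demand = demand_ab + demand_ba
    if total_demand == 0:
        # Idle link: fall back to an even static split.
        half = total_lanes // 2
        return half, total_lanes - half
    lanes_ab = round(total_lanes * demand_ab / total_demand)
    if demand_ab > 0:
        lanes_ab = max(lanes_ab, 1)                # A->B keeps one lane
    if demand_ba > 0:
        lanes_ab = min(lanes_ab, total_lanes - 1)  # B->A keeps one lane
    return lanes_ab, total_lanes - lanes_ab


print(allocate_lanes(4, 6, 2))    # skewed demand favours A->B
print(allocate_lanes(4, 100, 1))  # B->A still keeps one lane
```

With a static split, each direction would own two of the four lanes regardless of load; the adaptive split lets a narrower physical channel cover the same peak demand in one direction.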

Citations: 53

MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect

2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip Pub Date : 2012-05-09 DOI: 10.1109/NOCS.2012.8

Chris Fallin, Greg Nazario, Xiangyao Yu, K. Chang, Rachata Ausavarungnirun, O. Mutlu

Abstract: A conventional Network-on-Chip (NoC) router uses input buffers to store in-flight packets. These buffers improve performance, but consume significant power. It is possible to bypass these buffers when they are empty, reducing dynamic power, but static buffer power, and dynamic power when buffers are utilized, remain. To improve energy efficiency, bufferless deflection routing removes input buffers, and instead uses deflection (misrouting) to resolve contention. However, at high network load, deflections cause unnecessary network hops, wasting power and reducing performance. In this work, we propose a new NoC router design called the minimally-buffered deflection (MinBD) router. This router combines deflection routing with a small "side buffer," which is much smaller than conventional input buffers. A MinBD router places some network traffic that would otherwise have been deflected in this side buffer, reducing deflections significantly. The router buffers only a fraction of traffic, thus making more efficient use of buffer space than a router that holds every flit in its input buffers. We evaluate MinBD against input-buffered routers of various sizes that implement buffer bypassing, a bufferless router, and a hybrid design, and show that MinBD is more energy efficient than all prior designs, and has performance that approaches the conventional input-buffered router with area and power close to the bufferless router.

Pages: 1-10
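
The side-buffer idea described in the abstract can be illustrated with a toy, single-cycle model of one router. This is a sketch under stated assumptions, not the MinBD microarchitecture: the function `route_cycle`, the priority-by-list-order rule, and the policy of buffering every losing flit while space remains are all hypothetical, and re-injection of buffered flits is left out.

```python
from collections import deque


def route_cycle(flits, ports, side_buffer, capacity):
    """Resolve output-port contention for one cycle.

    flits: list of (flit_id, wanted_port), highest priority first.
    ports: list of output port names; assumes len(flits) <= len(ports).
    side_buffer: deque of flits parked for a later cycle.
    Returns a dict mapping output port -> flit_id.
    """
    assignment = {}
    losers = []
    # First pass: each flit takes its wanted port if it is still free.
    for flit_id, wanted in flits:
        if wanted not in assignment:
            assignment[wanted] = flit_id
        else:
            losers.append((flit_id, wanted))
    # Second pass: losers wait in the side buffer while it has room;
    # once it is full, the remaining losers are deflected to a free port.
    for flit_id, wanted in losers:
        if len(side_buffer) < capacity:
            side_buffer.append((flit_id, wanted))
        else:
            free_port = next(p for p in ports if p not in assignment)
            assignment[free_port] = flit_id
    return assignment


side = deque()
out = route_cycle([(1, "N"), (2, "N"), (3, "N")], ["N", "E", "S", "W"], side, capacity=1)
print(out, list(side))
```

In this run, three flits want the north port: the first wins it, the second is parked in the one-entry side buffer, and only the third is deflected. A purely bufferless router would have deflected both losers.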

Citations: 128

A Hybrid Buffer Design with STT-MRAM for On-Chip Interconnects

2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip Pub Date : 2012-05-09 DOI: 10.1109/NOCS.2012.30

Hyunjun Jang, Baik Song An, Nikhil Kulkarni, K. H. Yum, Eun Jung Kim

Abstract: As chip multiprocessor (CMP) design moves toward many-core architectures, communication delay in the Network-on-Chip (NoC) has become a major bottleneck in CMP systems. Using high-density memories in input buffers helps to reduce the bottleneck by increasing throughput. Spin-Torque Transfer Magnetic RAM (STT-MRAM) can be a suitable solution due to its high density and near-zero leakage power, but its long latency and high power consumption in write operations still need to be addressed. We explore the design issues in using STT-MRAM for NoC input buffers. Motivated by short intra-router latency, we use a previously proposed write latency reduction technique that sacrifices retention time. Then we propose a hybrid design of input buffers using both SRAM and STT-MRAM to hide the long write latency efficiently. Considering that simple data migration in the hybrid buffer consumes more dynamic power compared to SRAM, we provide a lazy migration scheme that reduces the dynamic power consumption of the hybrid buffer. Simulation results show that the proposed scheme enhances the throughput by 21% on average.

Pages: 193-200

Citations: 35

Transient and Permanent Error Control for High-End Multiprocessor Systems-on-Chip

2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip Pub Date : 2012-05-09 DOI: 10.1109/NOCS.2012.27

Qiaoyan Yu, José Cano, J. Flich, P. Ampadu

Abstract: High-end MPSoC systems with built-in high-radix topologies achieve good performance because of the improved connectivity and the reduced network diameter. In high-end MPSoC systems, fault tolerance support is becoming a compulsory feature. In this work, we propose a combined method to address permanent and transient link and router failures in those systems. The LBDRhr mechanism is proposed to tolerate permanent link failures in some popular high-radix topologies. The increased router complexity may lead to more transient router errors than in routers using a simple XY routing algorithm. We exploit the inherent information redundancy (IIR) in LBDRhr logic to manage transient errors in the network routers. Thorough analyses are provided to discover the appropriate internal nodes and the forbidden signal patterns for transient error detection. Simulation results show that LBDRhr logic can tolerate all of the permanent failure combinations of long-range links and 80% of link failures at short-range links. Case studies show that the error detection method based on the new IIR extraction method reduces the power consumption by 33% and the residual error rate by up to two orders of magnitude, compared to triple modular redundancy. The impact of network topologies on the efficiency of the detection mechanism has been examined in this work as well.

Pages: 169-176

Citations: 19

CCNoC: Specializing On-Chip Interconnects for Energy Efficiency in Cache-Coherent Servers

2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip Pub Date : 2012-05-09 DOI: 10.1109/NOCS.2012.15

Stavros Volos, Ciprian Seiculescu, Boris Grot, Naser Khosro Pour, B. Falsafi, G. Micheli

Abstract: Many-core chips are emerging as the architecture of choice to provide power efficiency and improve performance, while riding Moore's Law. In these architectures, on-chip interconnects play a pivotal role in ensuring power and performance scalability. As supply voltages begin to level off in future technologies, chip designs in general and interconnects in particular will require specialization to meet power and performance objectives. In this work, we make the observation that cache-coherent many-core server chips exhibit a duality in on-chip network traffic. Request traffic largely consists of simple control messages, while response traffic often carries cache-block-sized payloads. We present Cache-Coherence Network-on-Chip (CCNoC), a design that specializes the NoC to fit the demands of server workloads via a pair of asymmetric networks tuned to the type of traffic traversing them. The networks differ in their datapath width, router microarchitecture, flow control strategy, and delay. The resulting heterogeneous CCNoC architecture enables significant gains in power efficiency over conventional NoC designs at similar performance levels. Our evaluation reveals that a 4×4 mesh-based chip multiprocessor with the proposed CCNoC organization running commercial server workloads is 15-28% more energy efficient than various state-of-the-art single- and dual-network organizations.

Pages: 67-74
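
The request/response duality noted in the abstract can be made concrete with a small flit-count calculation. All sizes here are assumptions chosen for illustration (a 64-bit control message, a 512-bit cache block with a 64-bit header, a 64-bit request network and a 128-bit response network); the abstract does not give the widths used in CCNoC.

```python
import math


def flits_needed(message_bits, link_width_bits):
    """Number of flits needed to carry a message over a link of the given width."""
    return math.ceil(message_bits / link_width_bits)


# Hypothetical message sizes, in bits.
CONTROL_MSG = 64        # request: address plus command
DATA_MSG = 64 + 512     # response: header plus one cache block

print(flits_needed(CONTROL_MSG, 64))   # request on a narrow network
print(flits_needed(DATA_MSG, 128))     # response on a wide network
print(flits_needed(DATA_MSG, 64))      # response forced onto the narrow network
```

Under these assumed sizes, a control message fits in one flit on the narrow network, so a wider datapath would add area and power without helping requests, while the cache-block response benefits directly from extra width.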

Citations: 62

An Optimal Control Approach to Power Management for Multi-Voltage and Frequency Islands Multiprocessor Platforms under Highly Variable Workloads

2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip Pub Date : 2012-05-09 DOI: 10.1109/NOCS.2012.32

P. Bogdan, R. Marculescu, Siddhartha Jain, Rafael Tornero Gavilá

Abstract: Reducing energy consumption in multi-processor systems-on-chip (MPSoCs) where communication happens via the network-on-chip (NoC) approach calls for multiple voltage/frequency island (VFI)-based designs. In turn, such multi-VFI architectures need efficient, robust, and accurate run-time control mechanisms that can exploit the workload characteristics in order to save power. Despite being tractable, the linear control models for power management cannot capture some important workload characteristics (e.g., fractality, non-stationarity) observed in heterogeneous NoCs; if ignored, such characteristics lead to inefficient communication and resource allocation, as well as high power dissipation in MPSoCs. To mitigate such limitations, we propose a paradigm shift from power optimization based on linear models to control approaches based on fractal-state equations. As such, our approach is the first to propose a controller for fractal workloads with precise constraints on state and control variables and specific time bounds. Our results show that significant power savings (about 70%) can be achieved at run-time while running a variety of benchmark applications.

Pages: 35-42

Citations: 72

A Novel Flit Serialization Strategy to Utilize Partially Faulty Links in Networks-on-Chip

2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip Pub Date : 2012-05-09 DOI: 10.1109/NOCS.2012.22

Changlin Chen, Ye Lu, S. Cotofana

Citations: 21

Analytical Performance Modeling of Hierarchical Interconnect Fabrics

2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip Pub Date : 2012-05-09 DOI: 10.1109/NOCS.2012.20

N. Nikitin, Javier de San Pedro, J. Carmona, J. Cortadella

Citations: 13

Hierarchical Network-on-Chip and Traffic Compression for Spiking Neural Network Implementations

2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip Pub Date : 2012-05-09 DOI: 10.1109/NOCS.2012.17

Snaider Carrillo, J. Harkin, L. McDaid, S. Pande, Seamus Cawley, Brian McGinley, F. Morgan

Abstract: The complexity of inter-neuron connectivity is prohibiting scalable hardware implementations of spiking neural networks (SNNs). Traditional neuron interconnect using a shared bus topology is not scalable due to non-linear growth of neuron connections with the neural network size. This paper presents a novel hierarchical NoC (H-NoC) architecture for SNN hardware which addresses the scalability issue by creating a 3-dimensional array of clusters of neurons with a hierarchical structure of low- and high-level routers. The H-NoC architecture also incorporates a spike traffic compression technique to exploit SNN traffic patterns, thus reducing traffic overhead and improving throughput on the network. In addition, adaptive routing capabilities between clusters balance local and global traffic loads to sustain throughput under bursting activity. Simulation results show a high throughput per cluster (3.33×10^9 spikes/second), and synthesis results using 65-nm CMOS technology demonstrate low area cost (0.587 mm^2) and power consumption (13.16 mW at 100 MHz) for a single cluster of 400 neurons, which outperforms existing SNN hardware strategies.

Pages: 83-90

Citations: 21

Generic Monitoring and Management Infrastructure for 3D NoC-Bus Hybrid Architectures

2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip Pub Date : 2012-05-09 DOI: 10.1109/NOCS.2012.28

A. Rahmani, Kameswar Rao Vaddina, Khalid Latif, P. Liljeberg, J. Plosila, H. Tenhunen

Abstract: Three-dimensional integrated circuits (3D ICs) achieve enhanced system integration and improved performance at lower cost and reduced area footprint. In order to exploit the intrinsic capability of reducing the wire length in 3D ICs, the 3D NoC-Bus Hybrid mesh architecture was proposed, which provides performance, power consumption, and area benefits. Besides its various advantages, this architecture offers a unique and hitherto unexplored way to implement an efficient system-wide monitoring network. In this paper, an integrated low-cost monitoring platform for 3D stacked mesh architectures is proposed which can be efficiently used for various system management purposes such as traffic monitoring, thermal management and fault tolerance. The proposed generic monitoring and management infrastructure, called ARB-NET, utilizes bus arbiters to exchange the monitoring information directly with each other without using the data network. As a test case, based on the proposed monitoring and management platform, a fully congestion-aware and inter-layer fault-tolerant routing algorithm named AdaptiveXYZ is presented, taking advantage of viable information generated using the bus arbiter network. In addition, we propose a thermal monitoring and management strategy on top of our ARB-NET infrastructure. Compared to recently proposed stacked mesh 3D NoCs, our extensive simulations with synthetic and real benchmarks reveal that our architecture using the AdaptiveXYZ routing can help in achieving significant power and performance improvements while preserving the system reliability with negligible hardware overhead.

Pages: 177-184

Citations: 17