{"title":"Design of TSV-Sharing Topologies for Cost-Effective 3D Networks-on-Chip","authors":"Poona Bahrebar, D. Stroobandt","doi":"10.1145/2835512.2835514","DOIUrl":"https://doi.org/10.1145/2835512.2835514","url":null,"abstract":"The Through-Silicon Via (TSV) technology has led to major breakthroughs in 3D stacking by providing higher speed and bandwidth, as well as lower power dissipation for the inter-layer communication. However, the current TSV fabrication suffers from a considerable area footprint and yield loss. Thus, it is necessary to restrict the number of TSVs in order to design cost-effective 3D on-chip networks. This critical issue can be addressed by clustering the network such that all of the routers within each cluster share a single TSV pillar for the vertical packet transmission. In some of the existing topologies, additional cluster routers are augmented into the mesh structure to handle the shared TSVs. However, they impose either performance degradation or power/area overhead to the system. Furthermore, the resulting architecture is no longer a mesh. In this paper, we redefine the clusters by replacing some routers in the mesh with the cluster routers, such that the mesh structure is preserved. The simulation results demonstrate a better equilibrium between performance and cost, using the proposed models.","PeriodicalId":424680,"journal":{"name":"Proceedings of the 8th International Workshop on Network on Chip Architectures","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127778426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Power and Latency Management in Heterogeneous 3D NoCs","authors":"Awet Yemane Weldezion, M. Ebrahimi, M. Daneshtalab, H. Tenhunen","doi":"10.1145/2835512.2835517","DOIUrl":"https://doi.org/10.1145/2835512.2835517","url":null,"abstract":"Besides different core sizes in many-core Systems-on-Chip, the cost and reliability issues of TSVs move 3D NoCs toward heterogeneous designs. Such heterogeneity introduces design complexity and new challenges for obtaining a high-performance, low-power, low-area, and reliable design. By taking all these factors into account, we propose a design that combines Q-Learning and deflection routing in heterogeneous 3D NoCs. This design enables the routing algorithm to dynamically adjust itself to the underlying traffic condition and topology arrangement at run time. Thereby, the network can reach its optimal performance and minimum power consumption shortly after a reconfiguration caused by either a fault in the network or a traffic change.","PeriodicalId":424680,"journal":{"name":"Proceedings of the 8th International Workshop on Network on Chip Architectures","volume":"312 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124440368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rethinking Memory System Design (along with Interconnects)","authors":"O. Mutlu","doi":"10.1145/2835512.2835520","DOIUrl":"https://doi.org/10.1145/2835512.2835520","url":null,"abstract":"The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck [27, 28]. At the same time, DRAM technology is experiencing difficult circuit and device scaling challenges that make the maintenance and enhancement of its capacity, energy-efficiency, and reliability significantly more costly with conventional techniques (see, for example [7, 8, 11, 12, 15, 17, 18, 22, 23, 32]). In this talk, we examine some promising research and design directions to overcome challenges posed by memory scaling. Specifically, we discuss three key solution directions: 1) enabling new memory architectures, functions, interfaces, and better integration of the memory and the rest of the system, including interconnects (e.g., [1, 2, 19, 20, 34-36]), 2) designing a memory system that intelligently employs multiple memory technologies and coordinates memory and storage management using non-volatile memory technologies (e.g., [16-18, 24, 25, 32, 33, 40-42]), 3) providing predictable performance and QoS to applications sharing the memory system (e.g., [3, 9, 10, 13, 14, 26, 29, 37-39]). As we discuss challenges and solution directions in memory, we will point out research opportunities in interconnects and memory-interconnect co-design (e.g., [2, 4-6, 19, 21, 30, 31]).","PeriodicalId":424680,"journal":{"name":"Proceedings of the 8th International Workshop on Network on Chip Architectures","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121234261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Task mapping and communication routing model for minimizing power consumption in multi-cores","authors":"Sergiu Carpov","doi":"10.1145/2835512.2835515","DOIUrl":"https://doi.org/10.1145/2835512.2835515","url":null,"abstract":"In this paper we introduce a novel MILP formulation for the problem of mapping tasks and routing communications on multi-core systems with power minimization objective. The cores have several power consumption modes. Dynamic and static power consumptions are modeled independently and the dynamic power consumption depends on core load rate. Three types of communication routing are examined: single-path, multi-path and fractional multi-path. Initially a mathematical model is introduced and afterwards a linearized mixed-integer program formulation is proposed. We conclude the paper by presenting computational results on task graph instances obtained from StreamIt applications.","PeriodicalId":424680,"journal":{"name":"Proceedings of the 8th International Workshop on Network on Chip Architectures","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122149785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Methodology to verify, debug and evaluate performances of NoC based interconnects","authors":"Patrick Oury, N. Heaton, Stewart Penman","doi":"10.1145/2835512.2835521","DOIUrl":"https://doi.org/10.1145/2835512.2835521","url":null,"abstract":"Latest developments in electronic systems lead hardware architects to completely rethink the traffic exchanges inside Systems on Chip. Constraints on more parallelism, less power, more coherency, less order while designing system fabrics and interconnects imply a number of new design features. All of them need to be verified, and all of them need to be evaluated with respect to their impact on system performance, power consumption and gate count. This paper discusses some of the most challenging features in NoCs and the way verification engineers and architects are tackling correctness and performance checking of them.","PeriodicalId":424680,"journal":{"name":"Proceedings of the 8th International Workshop on Network on Chip Architectures","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130538152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Low-Latency and High-Throughput Multiple-Level Arbitration Scheme Supporting Quality-of-Service in Optical On-chip Network","authors":"Jian Jie, Lai Mingche, Xiao Liquan","doi":"10.1145/2835512.2835519","DOIUrl":"https://doi.org/10.1145/2835512.2835519","url":null,"abstract":"As a key technology in optical NoC design, the arbitration scheme should provide differential arbitration service with high throughput and low latency for various types and priorities of traffic in CMPs. In this work, we propose a fast hierarchical arbitration supporting Quality-of-Service. With a multi-priority data buffer queue, arbiters provide differential transmissions and guarantee service for all queues. Our arbiter also presents the transmit bound resource reservation scheme to reserve time slots for all nodes fairly. We propose fast arbitration with a layout of fast optical arbitration channels to decrease the arbitration period, thereby reducing packet transmitting delay. The simulation results show that with our hierarchical arbitration scheme, all nodes are allocated with almost equal service under various patterns; thus, the min-communication-bandwidth and max-transmit-delay are guaranteed to be 5% and 80 cycles, respectively, under the overload demands. This scheme improves throughput by 17% compared to FeatherWeight under a self-similar traffic pattern and decreases arbitration delay by 15% compared to 2-pass arbitration.","PeriodicalId":424680,"journal":{"name":"Proceedings of the 8th International Workshop on Network on Chip Architectures","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115314870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NoCVision: A Network-on-Chip Dynamic Visualization Solution","authors":"V. Gogte, Doowon Lee, Ritesh Parikh, V. Bertacco","doi":"10.1145/2835512.2835518","DOIUrl":"https://doi.org/10.1145/2835512.2835518","url":null,"abstract":"Networks-on-chip (NoCs) are the communication infrastructure of choice for integrating the many components of modern silicon systems, deployed anywhere from systems-on-chip, to chip multi-processors, and to heterogeneous systems. The growing design complexity of these systems, coupled with shrinking times-to-market, requires efficient analysis of complex applications mapped onto the network in a short span of time. In this work, we propose NoCVision, a novel platform for the analysis of NoC characteristics and traffic flows. NoCVision enables design-space exploration, performance tuning, and validation of the NoC subsystem. It consolidates and summarizes the network's simulation data and visualizes it through intuitive diagrams and plots, either in a static form or animated to depict changes occurring over time during an application's execution. To showcase the features and benefits of NoCVision, we present several case studies developed on a 64-node CMP organized in an 8x8 mesh NoC and running multi-programmed workloads.","PeriodicalId":424680,"journal":{"name":"Proceedings of the 8th International Workshop on Network on Chip Architectures","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128527503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"System-Level Analysis of Network Interfaces for Hierarchical MPSoCs","authors":"J. Ax, Gregor Sievers, Martin Flasskamp, W. Kelly, T. Jungeblut, Mario Porrmann","doi":"10.1145/2835512.2835513","DOIUrl":"https://doi.org/10.1145/2835512.2835513","url":null,"abstract":"Network Interfaces (NIs) are used in Multiprocessor System-on-Chips (MPSoCs) to connect CPUs to a packet switched Network-on-Chip. In this work we introduce a new NI architecture for our hierarchical CoreVA-MPSoC. The CoreVA-MPSoC targets streaming applications in embedded systems. The main contribution of this paper is a system-level analysis of different NI configurations, considering both software and hardware costs for NoC communication. Different configurations of the NI are compared using a benchmark suite of 10 streaming applications. The best performing NI configuration shows an average speedup of 20 for a CoreVA-MPSoC with 32 CPUs compared to a single CPU. Furthermore, we present physical implementation results using a 28 nm FD-SOI standard cell technology. A hierarchical MPSoC with 8 CPU clusters and 4 CPUs in each cluster running at 800 MHz requires an area of 4.56 mm².","PeriodicalId":424680,"journal":{"name":"Proceedings of the 8th International Workshop on Network on Chip Architectures","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128883411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 8th International Workshop on Network on Chip Architectures","authors":"","doi":"10.1145/2835512","DOIUrl":"https://doi.org/10.1145/2835512","url":null,"abstract":"","PeriodicalId":424680,"journal":{"name":"Proceedings of the 8th International Workshop on Network on Chip Architectures","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127720579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}