{"title":"Bi-Synchronous FIFO for Synchronous Circuit Communication Well Suited for Network-on-Chip in GALS Architectures","authors":"I. Panades, A. Greiner","doi":"10.1109/NOCS.2007.14","DOIUrl":"https://doi.org/10.1109/NOCS.2007.14","url":null,"abstract":"The distribution of a synchronous clock in a system-on-chip (SoC) has become a problem because of wire length and process variation. Novel approaches such as globally asynchronous, locally synchronous (GALS) design try to solve this issue by partitioning the SoC into isolated synchronous islands. This paper describes the bisynchronous FIFO used in the DSPIN network-on-chip, which can interface systems working with different clock signals (frequency and/or phase). Its interfaces are synchronous and its architecture is scalable and synthesizable in synchronous standard cells. The metastability situations and the FIFO's latency are analyzed. Its throughput, maximum frequency, and area are evaluated as a function of the FIFO depth.","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115195612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transaction-Based Communication-Centric Debug","authors":"K. Goossens, B. Vermeulen, R. .. Steeden, M. Bennebroek","doi":"10.1109/NOCS.2007.46","DOIUrl":"https://doi.org/10.1109/NOCS.2007.46","url":null,"abstract":"The behaviour of systems on chip (SOC) is complex because they contain multiple processors that interact through concurrent interconnects, such as networks on chip (NOC). Debugging such SOCs is hard. Based on a classification of debug scope and granularity, we propose that debugging should be communication-centric and based on transactions. Communication-centric debug focuses on the communication and the synchronisation between the IP blocks, which are implemented by the interconnect using transactions. We define and implement a modular debug architecture, based on NOC, monitors, and a dedicated high-speed event-distribution broadcast interconnect. The manufacturing-test scan chains and IEEE1149.1 test access ports (TAP) are re-used for configuration and debug data read-out. Our debug architecture requires only small changes to the functional architecture. The additional area cost is limited to the monitors and the event distribution interconnect, which are 4.5% of the NOC area, or less than 0.2% of the SOC area. The debug architecture runs at NOC functional speed and reacts very quickly to debug events to stop the SOC close in time to the condition that raised the event. The speed at which data is retrieved from the SOC after stopping using the TAP is 10 MHz. We prove our concepts and architecture with a gate-level implementation that includes the NOC, event distribution interconnect, and clock, reset, and TAP controllers. We include gate-level signal traces illustrating debug at message and transaction levels","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115851953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ASC, a SystemC Extension for Modeling Asynchronous Systems, and Its Application to an Asynchronous NoC","authors":"Cedric Koch-Hofer, M. Renaudin, Y. Thonnart, P. Vivet","doi":"10.1109/NOCS.2007.12","DOIUrl":"https://doi.org/10.1109/NOCS.2007.12","url":null,"abstract":"This paper presents ASC, an Asynchronous SystemC library, as an extension of SystemC for modeling asynchronous circuits. ASC includes a set of port and channel primitives offering the same communication primitives as the common languages used for asynchronous circuits modeling (CHP, Tangram or Balsa). ASC also offers operators and statements in order to accurately model arbiters, which are the basic components of asynchronous network on chips. The aim of this work is to provide to the designers the means of modeling and verifying asynchronous circuits as well as GALS and NoC systems. Synthesis of ASC models with the help of the TAST framework is under development. As an illustrative example, the modeling of an asynchronous network-on-chip architecture using the ASC library is described. This NoC has been successfully integrated into a complex GALS NoC architecture taking advantage of a multi-level SystemC based verification environment","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130892373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Network on Chips","authors":"J. Flich, A. Mejia, P. López, J. Duato","doi":"10.1109/NOCS.2007.39","DOIUrl":"https://doi.org/10.1109/NOCS.2007.39","url":null,"abstract":"The design of scalable and reliable interconnection networks for system on chips (SoCs) introduces new design constraints not present in current multicomputer systems. Although regular topologies are preferred for building NoCs, heterogeneous blocks, fabrication faults and reliability issues derived from the high integration scale may lead to irregular topologies. In this situation, efficient routing becomes a challenge. Although table-based routing allows the use of most routing algorithms on any topology, it does not scale in terms of latency and area. In this paper we propose the region-based routing mechanism that avoids the scalability problems of table-based solutions. From an initial topology and routing algorithm, the mechanism groups, at every switch, destinations into different regions based on the output ports. By doing this, redundant routing information typically found in routing tables is eliminated. Evaluation results show that the mechanism requires only four regions to support several routing algorithms in a 2D mesh with no performance degradation. Moreover, when dealing with link failures, our results indicate that the mechanism combined with the segment-based routing algorithm is able to pack all the routing information into eight regions providing high throughput. The paper also provides a simple and efficient hardware implementation of the mechanism requiring only 240 logic gates per switch to support eight regions in a 2D mesh topology","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132506930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implications of Rent's Rule for NoC Design and Its Fault-Tolerance","authors":"Daniel Greenfield, A. Banerjee, Jeong-Gun Lee, S. Moore","doi":"10.1109/NOCS.2007.26","DOIUrl":"https://doi.org/10.1109/NOCS.2007.26","url":null,"abstract":"Rent's rule is a powerful tool for exploring VLSI design and technology scaling issues. This paper applies the principles of Rent's rule to the analysis of networks-on-chip (NoC). In particular, a bandwidth-version of Rent's rule is derived, and its implications for future NoC scaling examined. Hop-length distributions for Rent's and other traffic models are then applied to analyse NoC router activity. For fault-tolerant design, a new type of router is proposed based on this analysis, and it is evaluated for mutability and its impact on congestion by further use of the hop-length distributions. It is shown that the choice of traffic model has a significant impact on scaling behaviour, design and fault-tolerant analysis","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"486 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122745150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Power and Energy Exploration of Network-on-Chip Architectures","authors":"A. Banerjee, R. Mullins, S. Moore","doi":"10.1109/NOCS.2007.6","DOIUrl":"https://doi.org/10.1109/NOCS.2007.6","url":null,"abstract":"In this study, we analyse the move towards networks-on-chips from an energy perspective by accurately modelling a circuit-switched router, a wormhole router and a speculative virtual-channel router in a 90nm CMOS process. All the routers are shown to dissipate significant idle state power. The additional energy required to route a packet through the router is then shown to be dominated by the data-path. This leads to the key result that, if this trend continues, the energy cost of more elaborate control would not be vast, making it easier to justify. Given effective clock-gating, this additional energy is also shown to be more or less independent of network congestion. Accurate speed and area metrics are also reported for the networks, which would allow a more complete comparison to be made across the NoC architectural space considered","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124682406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Binomial Mapping and Optimization Algorithm for Reduced-Complexity Mesh-Based On-Chip Network","authors":"Wein-Tsung Shen, Chih-Hao Chao, Yu-Kuang Lien, A. Wu","doi":"10.1109/NOCS.2007.5","DOIUrl":"https://doi.org/10.1109/NOCS.2007.5","url":null,"abstract":"This paper presents an efficient binomial IP mapping and optimization algorithm (BMAP) to reduce the hardware cost of on-chip network (OCN) infrastructure. The complexity of BMAP is O(N^2 log(N)). Based on our OCN system synthesis flow, the proposed algorithm provides more economical network component mapping than traditional OCN mapping algorithms. The experimental results show that total traffic on the network is reduced by 37% and the average network hop count is reduced by 46%. With further optimization, the hardware efficiency is enhanced, so the total hardware cost of the network infrastructure is reduced to 51%~85%","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114668225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Study of NoC Exit Strategies","authors":"Mikael Millberg, A. Jantsch","doi":"10.1109/NOCS.2007.7","DOIUrl":"https://doi.org/10.1109/NOCS.2007.7","url":null,"abstract":"The throughput of a network is limited by several interacting components. Analysing simulation results made it clear that the component worth attacking was the exit bandwidth between the network and the connected resources. The obvious approach is to increase this bandwidth; the benefit is a higher throughput of the network and a significant lowering of the buffer requirements at the entry points of the network, because worst-case scenarios now happen at a higher injection rate. The results we present show significant differences in throughput as well as in average and worst-case latency","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124184193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approaching Ideal NoC Latency with Pre-Configured Routes","authors":"George Michelogiannakis, D. Pnevmatikatos, M. Katevenis","doi":"10.1109/NOCS.2007.10","DOIUrl":"https://doi.org/10.1109/NOCS.2007.10","url":null,"abstract":"In multi-core ASICs, processors and other compute engines need to communicate with memory blocks and other cores with latency as close as possible to the ideal of a direct buffered wire. However, current state of the art networks-on-chip (NoCs) suffer, at best, latency of one clock cycle per hop. We investigate the design of a NoC that offers close to the ideal latency in some preferred, run-time configurable paths. Processors and other compute engines may perform network reconfiguration to guarantee low latency over different sets of paths as needed. Flits in non-preferred paths are given lower priority than flits in preferred ones, and suffer a delay of one clock cycle per hop when there is no contention. To achieve our goal, we use the \"mad-postman\" technique: every incoming flit is eagerly (i.e. speculatively) forwarded to the input's preferred output, if any. This is accomplished with the mere delay of a single pre-enabled tri-state driver. We later check if that decision was correct, and if not, we forward the flit to the proper output. Incorrectly forwarded flits are classified as dead and eliminated in later hops. We use a 2D mesh topology tailored for processor-memory communication, and a modified version of XY routing that remains deadlock-free. Performance gains are significant and can be proven greatly useful in other application domains as well","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122450674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QNoC Asynchronous Router with Dynamic Virtual Channel Allocation","authors":"R. Dobkin, R. Ginosar, I. Cidon","doi":"10.1109/NOCS.2007.36","DOIUrl":"https://doi.org/10.1109/NOCS.2007.36","url":null,"abstract":"An asynchronous router for quality-of-service NoC is presented. It combines multiple service levels (SL) with multiple equal-priority virtual channels (VC) within each level. The VCs are assigned dynamically per link. A different number of VCs may be assigned to each SL and to each link. The router employs fast arbitration schemes to minimize latency","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133754126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}