George Michelogiannakis, D. Pnevmatikatos, M. Katevenis
{"title":"用预配置路由逼近理想NoC延迟","authors":"George Michelogiannakis, D. Pnevmatikatos, M. Katevenis","doi":"10.1109/NOCS.2007.10","DOIUrl":null,"url":null,"abstract":"In multi-core ASICs, processors and other compute engines need to communicate with memory blocks and other cores with latency as close as possible to the ideal of a direct buffered wire. However, current state of the art networks-on-chip (NoCs) suffer, at best, latency of one clock cycle per hop. We investigate the design of a NoC that offers close to the ideal latency in some preferred, run-time configurable paths. Processors and other compute engines may perform network reconfiguration to guarantee low latency over different sets of paths as needed. Flits in non-preferred paths are given lower priority than flits in preferred ones, and suffer a delay of one clock cycle per hop when there is no contention. To achieve our goal, we use the \"mad-postman\" technique: every incoming flit is eagerly (i.e. speculatively) forwarded to the input's preferred output, if any. This is accomplished with the mere delay of a single pre-enabled tri-state driver. We later check if that decision was correct, and if not, we forward the flit to the proper output. Incorrectly forwarded flits are classified as dead and eliminated in later hops. We use a 2D mesh topology tailored for processor-memory communication, and a modified version of XY routing that remains deadlock-free. Performance gains are significant and can be proven greatly useful in other application domains as well","PeriodicalId":132772,"journal":{"name":"First International Symposium on Networks-on-Chip (NOCS'07)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"52","resultStr":"{\"title\":\"Approaching Ideal NoC Latency with Pre-Configured Routes\",\"authors\":\"George Michelogiannakis, D. Pnevmatikatos, M. Katevenis\",\"doi\":\"10.1109/NOCS.2007.10\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In multi-core ASICs, processors and other compute engines need to communicate with memory blocks and other cores with latency as close as possible to the ideal of a direct buffered wire. However, current state of the art networks-on-chip (NoCs) suffer, at best, latency of one clock cycle per hop. We investigate the design of a NoC that offers close to the ideal latency in some preferred, run-time configurable paths. Processors and other compute engines may perform network reconfiguration to guarantee low latency over different sets of paths as needed. Flits in non-preferred paths are given lower priority than flits in preferred ones, and suffer a delay of one clock cycle per hop when there is no contention. To achieve our goal, we use the \\\"mad-postman\\\" technique: every incoming flit is eagerly (i.e. speculatively) forwarded to the input's preferred output, if any. This is accomplished with the mere delay of a single pre-enabled tri-state driver. We later check if that decision was correct, and if not, we forward the flit to the proper output. Incorrectly forwarded flits are classified as dead and eliminated in later hops. We use a 2D mesh topology tailored for processor-memory communication, and a modified version of XY routing that remains deadlock-free. Performance gains are significant and can be proven greatly useful in other application domains as well\",\"PeriodicalId\":132772,\"journal\":{\"name\":\"First International Symposium on Networks-on-Chip (NOCS'07)\",\"volume\":\"47 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-05-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"52\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"First International Symposium on Networks-on-Chip (NOCS'07)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/NOCS.2007.10\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"First International Symposium on Networks-on-Chip (NOCS'07)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NOCS.2007.10","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Approaching Ideal NoC Latency with Pre-Configured Routes
In multi-core ASICs, processors and other compute engines need to communicate with memory blocks and other cores with latency as close as possible to the ideal of a direct buffered wire. However, current state of the art networks-on-chip (NoCs) suffer, at best, latency of one clock cycle per hop. We investigate the design of a NoC that offers close to the ideal latency in some preferred, run-time configurable paths. Processors and other compute engines may perform network reconfiguration to guarantee low latency over different sets of paths as needed. Flits in non-preferred paths are given lower priority than flits in preferred ones, and suffer a delay of one clock cycle per hop when there is no contention. To achieve our goal, we use the "mad-postman" technique: every incoming flit is eagerly (i.e. speculatively) forwarded to the input's preferred output, if any. This is accomplished with the mere delay of a single pre-enabled tri-state driver. We later check if that decision was correct, and if not, we forward the flit to the proper output. Incorrectly forwarded flits are classified as dead and eliminated in later hops. We use a 2D mesh topology tailored for processor-memory communication, and a modified version of XY routing that remains deadlock-free. Performance gains are significant and can be proven greatly useful in other application domains as well