Yung-Chang Chang, C. Chiu, Shih-Yin Lin, Chung-Kai Liu
16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011), published 2011-01-25. DOI: https://doi.org/10.1109/ASPDAC.2011.5722228
On the design and analysis of fault tolerant NoC architecture using spare routers
Aggressive scaling of VLSI manufacturing technology has had a dramatic impact on the dependability of devices and interconnects. In modern many-core systems, the mesh-based Network-on-Chip (NoC) is widely adopted as the on-chip communication infrastructure, so an effective fault tolerance scheme for mesh-based NoCs is critical. A faulty router or broken link isolates an otherwise functional processing element (PE), and a set of faulty routers can form faulty regions that may break down the whole design. To address these issues, we propose a router-level fault tolerance scheme with spare routers, which differs from the traditional microarchitecture-level approach. The spare routers not only provide redundancy but also diversify the connection paths between adjacent routers. To exploit these resources for fault tolerance, we present two configuration algorithms: shift-and-replace-allocation (SARA), and defect-awareness-path-allocation (DAPA), which takes advantage of the path diversity in our architecture. The proposed design is transparent to any routing algorithm because the configured topology is consistent with the original mesh. Experimental results show that our scheme substantially improves fault tolerance metrics including reliability, mean time to failure (MTTF), and yield. In addition, the performance of the spare routers increases as the NoC grows, while the relative connection cost decreases. This characteristic makes our solution suitable for large-scale NoC design.
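The abstract does not spell out how SARA works, so the following is only a minimal sketch of the general shift-and-replace idea, not the authors' algorithm. It assumes one spare router at the end of each row, independent router failures with a common survival probability `r`, and that any single fault in a row can be repaired by shifting. The function names, the per-row spare layout, and the reliability formula are all illustrative assumptions.

```python
from __future__ import annotations

from typing import Optional


def shift_and_replace(row_size: int, faulty: set[int]) -> Optional[list[int]]:
    """Map logical columns 0..row_size-1 onto physical routers 0..row_size.

    Assumed layout (hypothetical): physical index `row_size` is the spare.
    Returns None when the row has more faults than its single spare covers.
    """
    healthy = [p for p in range(row_size + 1) if p not in faulty]
    if len(healthy) < row_size:
        return None
    # Logical routers keep their left-to-right order, so the routing
    # algorithm still sees a regular mesh; routers past a fault shift by one.
    return healthy[:row_size]


def row_reliability(row_size: int, r: float, spare: bool) -> float:
    """Probability that a row still offers `row_size` working routers."""
    if not spare:
        return r ** row_size
    n = row_size + 1  # the spare itself can also fail
    # Row survives with zero faults or exactly one fault among n routers.
    return r ** n + n * r ** (n - 1) * (1 - r)


def mesh_reliability(rows: int, cols: int, r: float, spare: bool) -> float:
    """Reliability of the whole mesh, treating rows as independent."""
    return row_reliability(cols, r, spare) ** rows


if __name__ == "__main__":
    print(shift_and_replace(4, {1}))       # physical routers used per column
    print(mesh_reliability(4, 4, 0.9, spare=False))
    print(mesh_reliability(4, 4, 0.9, spare=True))
```

Because the mapping preserves the logical order of routers within a row, the reconfigured row looks like the original one to the routing layer, which is the transparency property the abstract describes. The reliability functions only illustrate why a spare per row helps under the stated assumptions; they do not reproduce the paper's reliability, MTTF, or yield results.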