Fault-tolerant mesh for 3D network on chip

Kai-Yang Hsieh, Bo-Chuan Cheng, Ruei-Ting Gu, Katherine Shu-Min Li
Published in: 2011 6th International Microsystems, Packaging, Assembly and Circuits Technology Conference (IMPACT), pp. 202-205
Publication date: 2011-12-29
DOI: 10.1109/IMPACT.2011.6117292 (https://doi.org/10.1109/IMPACT.2011.6117292)
Citations: 12

Abstract

3D Mesh NoCs (Networks on Chip) are one of the best approaches to managing the complexity of interconnect structures in SoCs (Systems on Chip), a complexity that leads to lower yield. In this paper, we present a fault-tolerant Mesh-based scheme for 3D NoCs that helps increase chip reliability and yield. The scheme has four phases. Phase I transforms a 2D NoC into an optimized 3D NoC under constraints on area, routing length, temperature, performance, and so on. We then optimize the I/O placement to obtain the best routing between the I/O pads and all cores by clustering the placement of each core, and we reassign the tier sequence to minimize the number of TSVs. Finally, we build a square Mesh topology for each tier, sized to hold the maximum number of cores on any tier. For example, a 4×4 Mesh is needed if the maximum number of cores per tier is 15. Once the 3D Mesh topology is ready, phase II sets up a routing scheme that provides the minimum number of routers and the minimum routing latency. The routing scheme also controls the data flow and distributes the communication overhead. Phase III searches for replacement routing paths, so that each connection has at least two paths. The more replacement paths are found, the more faults can be tolerated, but the more computing time is needed. Phase IV verifies the fault-tolerant 3D Mesh NoC in two steps. First, we randomly insert some faults to check whether the NoC still works. Second, we increase the number of faults until the system crashes, which gives the maximum number of faults that can be tolerated. The verification may need to be repeated hundreds of times to approximate this maximum. If the fault tolerance is not good enough, we can return to phase III and search for more replacement paths. Experimental results show that the verified fault-tolerant 3D Mesh scheme is effective and efficient. The scheme can efficiently transform a complex 2D NoC into a fault-tolerant 3D Mesh NoC according to user-defined constraints, and it also provides a tradeoff analysis between the fault tolerance and the search time for effective replacement paths.
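
The mesh-sizing rule and the fault-insertion check described in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation: the function names, the link-fault model, and the pass criterion (every router can still reach every other router over any surviving link) are assumptions made for illustration. The paper's scheme relies on precomputed replacement paths, which this sketch replaces with a plain breadth-first search.

```python
import math
import random
from collections import deque


def mesh_side(max_cores_per_tier: int) -> int:
    """Smallest n such that an n x n mesh holds max_cores_per_tier cores."""
    n = math.isqrt(max_cores_per_tier)
    return n if n * n == max_cores_per_tier else n + 1


def mesh_links(side: int, tiers: int) -> list:
    """Undirected links of a side x side x tiers mesh.

    Links along z stand in for TSVs between tiers (an assumption).
    """
    links = []
    for x in range(side):
        for y in range(side):
            for z in range(tiers):
                if x + 1 < side:
                    links.append(((x, y, z), (x + 1, y, z)))
                if y + 1 < side:
                    links.append(((x, y, z), (x, y + 1, z)))
                if z + 1 < tiers:
                    links.append(((x, y, z), (x, y, z + 1)))
    return links


def is_connected(side: int, tiers: int, links: list, faulty: set) -> bool:
    """True if every router is reachable from (0, 0, 0) over non-faulty links."""
    adjacency = {}
    for index, (a, b) in enumerate(links):
        if index in faulty:
            continue
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    start = (0, 0, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == side * side * tiers


def faults_until_failure(side: int, tiers: int, rng: random.Random) -> int:
    """Insert random link faults one by one; return how many were tolerated."""
    links = mesh_links(side, tiers)
    order = list(range(len(links)))
    rng.shuffle(order)
    faulty = set()
    for index in order:
        faulty.add(index)
        if not is_connected(side, tiers, links, faulty):
            return len(faulty) - 1
    return len(faulty)


def estimate_tolerance(side: int, tiers: int, trials: int, seed: int = 0):
    """Repeat the fault-insertion run; return (worst case, average) tolerated faults."""
    rng = random.Random(seed)
    results = [faults_until_failure(side, tiers, rng) for _ in range(trials)]
    return min(results), sum(results) / len(results)


if __name__ == "__main__":
    side = mesh_side(15)  # 15 cores on the fullest tier -> 4 x 4 mesh
    print(side, estimate_tolerance(side, tiers=2, trials=200))
```

Repeating the run many times mirrors the abstract's remark that verification may take hundreds of iterations: a single random fault sequence gives only one sample, so the worst case and the average over many trials are reported together.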