Semi-adaptive distributed approach for triplet-based architecture inter-core communication Network-on-Chip

IF 2.5 · Zone 3 (Engineering & Technology) · JCR Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Integration-The Vlsi Journal Pub Date : 2025-08-04 DOI:10.1016/j.vlsi.2025.102489

Karim Soliman, Chunfeng Li, Shi Feng

Integration-The Vlsi Journal, Volume 105, Article 102489. https://www.sciencedirect.com/science/article/pii/S0167926025001464

Citations: 0

Abstract

Network-on-Chip (NoC) architectures offer significant performance improvements over traditional bus-based systems. However, as NoC designs become more complex, congestion within channels and buffers can degrade performance. Efficient routing algorithms are essential to mitigate congestion and optimize overall NoC performance. This study examines the Triplet-Based Architecture (TriBA) and its baseline deterministic shortest-path routing algorithm. The deterministic nature of this algorithm, combined with TriBA’s inherent characteristics, may exacerbate congestion and lead to routing deadlocks, particularly at high-traffic nodes. This work introduces a novel two-stage semi-adaptive routing algorithm to address congestion within TriBA-NoC. The proposed approach uses congestion levels within the downstream buffers of TriBA sub-level apex vertices as an additional routing metric, applied solely at the source node. Its primary goal is to alleviate congestion, specifically at hot nodes, which in turn reduces communication latency, shortens queuing times in downstream buffers, lowers power consumption, and improves overall network throughput. Simulations using gem5-HeteroGarnet support the effectiveness of the proposed semi-adaptive routing for TriBA-NoC. The approach reduces latency by up to 31.71% and buffer queuing time by up to 43.14%. Additionally, it enhances throughput by 6.28% and lowers downstream buffer power consumption by an average of 12.15% across various traffic patterns when compared to baseline algorithms. These improvements carry over to practical workloads: simulations using the PARSEC benchmark suite show latency reductions between 0.91% and 45.86%. These gains come with a modest power overhead, estimated at approximately 3.94%.
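The abstract does not spell out the two-stage algorithm, so the following is only a minimal sketch of the general idea it describes: a source node that consults downstream-buffer congestion at candidate apex vertices before committing to a path. Everything here is an assumption made for illustration and not the authors' method: the `ApexCandidate` record, the occupancy/capacity congestion measure, the `threshold` parameter, and the tie-breaking rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApexCandidate:
    """Hypothetical view of one sub-level apex vertex as seen by the source."""
    name: str       # illustrative label, not a TriBA addressing scheme
    hops: int       # path length source -> destination via this apex
    occupancy: int  # flits currently queued in its downstream buffer
    capacity: int   # downstream buffer depth in flits (assumed > 0)


def congestion(c: ApexCandidate) -> float:
    """Assumed congestion metric: fraction of the downstream buffer in use."""
    return c.occupancy / c.capacity


def choose_apex(candidates, threshold=0.5):
    """Pick the apex to route through; the decision is made once, at the source.

    Assumed policy: keep the shortest-path apex unless its downstream buffer
    is more congested than `threshold`; otherwise fall back to the least
    congested candidate, breaking ties by hop count.
    """
    if not candidates:
        raise ValueError("at least one candidate apex is required")
    shortest = min(candidates, key=lambda c: (c.hops, congestion(c)))
    if congestion(shortest) <= threshold:
        return shortest  # behaves like the deterministic baseline
    return min(candidates, key=lambda c: (congestion(c), c.hops))


if __name__ == "__main__":
    hot = ApexCandidate("A", hops=3, occupancy=7, capacity=8)
    cool = ApexCandidate("B", hops=4, occupancy=1, capacity=8)
    print(choose_apex([hot, cool]).name)
```

In this sketch the remaining hops would follow the deterministic shortest-path routing unchanged, which is one plausible reading of "semi-adaptive": adaptivity is confined to a single decision at the source rather than being re-evaluated at every router.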


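Source journal

Integration-The Vlsi Journal (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 3.80

Self-citation rate: 5.30%

Articles published: 107

Review time: 6 months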

Journal description: Integration's aim is to cover every aspect of the VLSI area, with an emphasis on cross-fertilization between various fields of science, and the design, verification, test and applications of integrated circuits and systems, as well as closely related topics in process and device technologies. Individual issues will feature peer-reviewed tutorials and articles as well as reviews of recent publications. The intended coverage of the journal can be assessed by examining the following (non-exclusive) list of topics: Specification methods and languages; Analog/Digital Integrated Circuits and Systems; VLSI architectures; Algorithms, methods and tools for modeling, simulation, synthesis and verification of integrated circuits and systems of any complexity; Embedded systems; High-level synthesis for VLSI systems; Logic synthesis and finite automata; Testing, design-for-test and test generation algorithms; Physical design; Formal verification; Algorithms implemented in VLSI systems; Systems engineering; Heterogeneous systems.