Yunzhen Luo;Yan Ding;Zhuo Tang;Keqin Li;Kenli Li;Chubo Liu
{"title":"面向图神经网络的统一位稀疏感知加速器BEAST-GNN","authors":"Yunzhen Luo;Yan Ding;Zhuo Tang;Keqin Li;Kenli Li;Chubo Liu","doi":"10.1109/TC.2025.3558587","DOIUrl":null,"url":null,"abstract":"Graph Neural Networks (GNNs) excel in processing graph-structured data, making them attractive and promising for tasks such as recommender systems and traffic forecasting. However, GNNs’ irregular computational patterns limit their ability to achieve low latency and high energy efficiency, particularly in edge computing environments. Current GNN accelerators predominantly focus on value sparsity, underutilizing the potential performance gains from bit-level sparsity. However, applying existing bit-serial accelerators to GNNs presents several challenges. These challenges arise from GNNs’ more complex data flow compared to conventional neural networks, as well as difficulties in data localization and load balancing with irregular graph data. To address these challenges, we propose BEAST-GNN, a bit-serial GNN accelerator that fully exploits bit-level sparsity. BEAST-GNN introduces streamlined sparse-dense bit matrix multiplication for optimized data flow, a column-overlapped graph partitioning method to enhance data locality by reducing memory access inefficiencies, and a sparse bit-counting strategy to ensure balanced workload distribution across processing elements (PEs). 
Compared to state-of-the-art accelerators, including HyGCN, GCNAX, Laconic, GROW, I-GCN, SGCN, and MEGA, BEAST-GNN achieves speedups of 21.7<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, 6.4<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, 10.5<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, 3.7<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, 4.0<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, 3.3<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, and 1.4<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula> respectively, while also reducing DRAM access by 36.3<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, 7.9<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, 6.6<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, 3.9<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, 5.38<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, 3.37<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>, and 1.44<inline-formula><tex-math>$\\boldsymbol{\\times}$</tex-math></inline-formula>. 
Additionally, BEAST-GNN consumes only 4.8%, 12.4%, 19.6%, 27.7%, 17.0%, 26.5%, and 82.8% of the energy required by these architectures.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 7","pages":"2402-2416"},"PeriodicalIF":3.8000,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"BEAST-GNN: A United Bit Sparsity-Aware Accelerator for Graph Neural Networks\",\"authors\":\"Yunzhen Luo;Yan Ding;Zhuo Tang;Keqin Li;Kenli Li;Chubo Liu\",\"doi\":\"10.1109/TC.2025.3558587\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph Neural Networks (GNNs) excel in processing graph-structured data, making them attractive and promising for tasks such as recommender systems and traffic forecasting. However, GNNs’ irregular computational patterns limit their ability to achieve low latency and high energy efficiency, particularly in edge computing environments. Current GNN accelerators predominantly focus on value sparsity, underutilizing the potential performance gains from bit-level sparsity. However, applying existing bit-serial accelerators to GNNs presents several challenges. These challenges arise from GNNs’ more complex data flow compared to conventional neural networks, as well as difficulties in data localization and load balancing with irregular graph data. To address these challenges, we propose BEAST-GNN, a bit-serial GNN accelerator that fully exploits bit-level sparsity. BEAST-GNN introduces streamlined sparse-dense bit matrix multiplication for optimized data flow, a column-overlapped graph partitioning method to enhance data locality by reducing memory access inefficiencies, and a sparse bit-counting strategy to ensure balanced workload distribution across processing elements (PEs). 
Compared to state-of-the-art accelerators, including HyGCN, GCNAX, Laconic, GROW, I-GCN, SGCN, and MEGA, BEAST-GNN achieves speedups of 21.7<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, 6.4<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, 10.5<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, 3.7<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, 4.0<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, 3.3<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, and 1.4<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula> respectively, while also reducing DRAM access by 36.3<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, 7.9<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, 6.6<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, 3.9<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, 5.38<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, 3.37<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>, and 1.44<inline-formula><tex-math>$\\\\boldsymbol{\\\\times}$</tex-math></inline-formula>. 
Additionally, BEAST-GNN consumes only 4.8%, 12.4%, 19.6%, 27.7%, 17.0%, 26.5%, and 82.8% of the energy required by these architectures.\",\"PeriodicalId\":13087,\"journal\":{\"name\":\"IEEE Transactions on Computers\",\"volume\":\"74 7\",\"pages\":\"2402-2416\"},\"PeriodicalIF\":3.8000,\"publicationDate\":\"2025-04-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Computers\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10955485/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computers","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10955485/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
BEAST-GNN: A United Bit Sparsity-Aware Accelerator for Graph Neural Networks
Graph Neural Networks (GNNs) excel in processing graph-structured data, making them attractive and promising for tasks such as recommender systems and traffic forecasting. However, GNNs’ irregular computational patterns limit their ability to achieve low latency and high energy efficiency, particularly in edge computing environments. Current GNN accelerators predominantly focus on value sparsity, underutilizing the potential performance gains from bit-level sparsity. However, applying existing bit-serial accelerators to GNNs presents several challenges. These challenges arise from GNNs’ more complex data flow compared to conventional neural networks, as well as difficulties in data localization and load balancing with irregular graph data. To address these challenges, we propose BEAST-GNN, a bit-serial GNN accelerator that fully exploits bit-level sparsity. BEAST-GNN introduces streamlined sparse-dense bit matrix multiplication for optimized data flow, a column-overlapped graph partitioning method to enhance data locality by reducing memory access inefficiencies, and a sparse bit-counting strategy to ensure balanced workload distribution across processing elements (PEs). Compared to state-of-the-art accelerators, including HyGCN, GCNAX, Laconic, GROW, I-GCN, SGCN, and MEGA, BEAST-GNN achieves speedups of 21.7×, 6.4×, 10.5×, 3.7×, 4.0×, 3.3×, and 1.4× respectively, while also reducing DRAM access by 36.3×, 7.9×, 6.6×, 3.9×, 5.38×, 3.37×, and 1.44×. Additionally, BEAST-GNN consumes only 4.8%, 12.4%, 19.6%, 27.7%, 17.0%, 26.5%, and 82.8% of the energy required by these architectures.
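To give a feel for why bit-level sparsity matters, the sketch below illustrates the general principle behind bit-serial arithmetic with zero-bit skipping. This is a conceptual illustration only, not BEAST-GNN's hardware design: the function name `bit_serial_mac` and its interface are invented for this example. A dense bit-serial multiply of an n-bit weight always costs n shift-add steps; a PE that skips zero bits pays only for the set ("essential") bits, which is the source of the savings the abstract refers to.

```python
def bit_serial_mac(activations, weights, bits=8):
    """Accumulate sum(a * w) by processing each weight one bit at a
    time, performing a shift-add only for bits that are set.

    Returns the accumulated dot product and the number of shift-add
    steps actually performed (a proxy for PE cycles)."""
    acc = 0
    work = 0  # shift-add steps actually executed
    for a, w in zip(activations, weights):
        for i in range(bits):
            if (w >> i) & 1:      # zero bits are skipped entirely
                acc += a << i     # shift-add for a set bit
                work += 1
    return acc, work

# Weights 17 (0b00010001) and 2 (0b00000010) have only 3 set bits in
# total, so the dot product 3*17 + 5*2 = 61 costs 3 shift-adds instead
# of the 2 * 8 = 16 a dense 8-bit bit-serial datapath would spend.
acc, work = bit_serial_mac([3, 5], [17, 2], bits=8)
```

The same idea scales to matrices: the fewer set bits in the operand stream, the fewer cycles a bit-serial PE array spends, which is the gain value-sparsity-only accelerators leave on the table.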
About the journal:
The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.