NDPGNN：用于 GNN 训练和推理加速的近数据处理架构

IF 2.7 3区计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2024-11-06 DOI:10.1109/TCAD.2024.3446871

Haoyang Wang;Shengbing Zhang;Xiaoya Fan;Zhao Yang;Meng Zhang

{"title":"NDPGNN：用于 GNN 训练和推理加速的近数据处理架构","authors":"Haoyang Wang;Shengbing Zhang;Xiaoya Fan;Zhao Yang;Meng Zhang","doi":"10.1109/TCAD.2024.3446871","DOIUrl":null,"url":null,"abstract":"Graph neural networks (GNNs) require a large number of fine-grained memory accesses, which results in inefficient use of bandwidth resources. In this article, we introduce a near-data processing architecture tailored for GNN acceleration, named NDPGNN. NDPGNN provides different operating modes to meet the acceleration needs of various GNN frameworks while ensuring the configurability and scalability of the system. NDPGNN takes advantage of data locality characteristics to repeatedly distribute and utilize data, thereby reducing memory access requirements, and further improving memory access efficiency by combining a subgraph sparse node scheduling strategy with intermediate result reuse. We use data packaging to provide a higher effective data ratio for long-distance data transmission, thereby improving the utilization of the system’s limited bandwidth resources. Compared with the previous method, NDPGNN brings 5.68 times improvement in system performance while reducing energy consumption overhead by 8.49 times.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"3997-4008"},"PeriodicalIF":2.7000,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"NDPGNN: A Near-Data Processing Architecture for GNN Training and Inference Acceleration\",\"authors\":\"Haoyang Wang;Shengbing Zhang;Xiaoya Fan;Zhao Yang;Meng Zhang\",\"doi\":\"10.1109/TCAD.2024.3446871\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph neural networks (GNNs) require a large number of fine-grained memory accesses, which results in inefficient use of bandwidth resources. In this article, we introduce a near-data processing architecture tailored for GNN acceleration, named NDPGNN. NDPGNN provides different operating modes to meet the acceleration needs of various GNN frameworks while ensuring the configurability and scalability of the system. NDPGNN takes advantage of data locality characteristics to repeatedly distribute and utilize data, thereby reducing memory access requirements, and further improving memory access efficiency by combining a subgraph sparse node scheduling strategy with intermediate result reuse. We use data packaging to provide a higher effective data ratio for long-distance data transmission, thereby improving the utilization of the system’s limited bandwidth resources. Compared with the previous method, NDPGNN brings 5.68 times improvement in system performance while reducing energy consumption overhead by 8.49 times.\",\"PeriodicalId\":13251,\"journal\":{\"name\":\"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems\",\"volume\":\"43 11\",\"pages\":\"3997-4008\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2024-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10745796/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10745796/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}

引用次数: 0

摘要

图神经网络（GNN）需要大量细粒度内存访问，导致带宽资源使用效率低下。在本文中，我们介绍了一种专为 GNN 加速定制的近数据处理架构，命名为 NDPGNN。NDPGNN 提供不同的运行模式，以满足各种 GNN 框架的加速需求，同时确保系统的可配置性和可扩展性。NDPGNN 利用数据局部性特点重复分发和利用数据，从而降低了内存访问要求，并通过将子图稀疏节点调度策略与中间结果重用相结合，进一步提高了内存访问效率。我们利用数据打包为远距离数据传输提供了更高的有效数据比率，从而提高了系统有限带宽资源的利用率。与之前的方法相比，NDPGNN 将系统性能提高了 5.68 倍，同时将能耗开销降低了 8.49 倍。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

NDPGNN: A Near-Data Processing Architecture for GNN Training and Inference Acceleration

Graph neural networks (GNNs) require a large number of fine-grained memory accesses, which results in inefficient use of bandwidth resources. In this article, we introduce a near-data processing architecture tailored for GNN acceleration, named NDPGNN. NDPGNN provides different operating modes to meet the acceleration needs of various GNN frameworks while ensuring the configurability and scalability of the system. NDPGNN takes advantage of data locality characteristics to repeatedly distribute and utilize data, thereby reducing memory access requirements, and further improving memory access efficiency by combining a subgraph sparse node scheduling strategy with intermediate result reuse. We use data packaging to provide a higher effective data ratio for long-distance data transmission, thereby improving the utilization of the system’s limited bandwidth resources. Compared with the previous method, NDPGNN brings 5.68 times improvement in system performance while reducing energy consumption overhead by 8.49 times.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 工程技术-工程：电子与电气

CiteScore

5.60

自引率

13.80%

发文量

500

审稿时长

7 months

期刊介绍： The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.