A Gather Accelerator for GNNs on FPGA Platform
W. Yuan, Teng Tian, Huawen Liang, Xi Jin
2021 IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS), published 2021-12-01
DOI: 10.1109/ICPADS53394.2021.00015 (https://doi.org/10.1109/ICPADS53394.2021.00015)
Citations: 1
Abstract
Graph Neural Networks (GNNs) have emerged as the state-of-the-art deep learning model for representation learning on graphs. GNNs consist mainly of two phases with different execution patterns. The Gather phase depends on the structure of the graph and presents a sparse, irregular execution pattern. The Apply phase behaves like other neural networks and shows a dense, regular execution pattern. Accelerating GNNs is challenging because gathering information within the graph requires irregular data communication. Hardware acceleration of the Gather phase is therefore critical. This work designs and implements an FPGA-based accelerator for the Gather phase that improves both performance and energy efficiency. Evaluation is performed on a Xilinx VCU128 FPGA with three commonly used datasets. Compared to the state-of-the-art software framework, our work achieves an average 101.28× speedup with a 75.27× dynamic energy reduction over an Intel Xeon CPU, and an average 12.27× speedup with a 45.56× dynamic energy reduction over an NVIDIA P100 GPU.
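The contrast between the two phases can be illustrated with a minimal software sketch. It is not the accelerator design from the paper: the CSR adjacency layout, the sum aggregation, the ReLU activation, and the names `gather` and `apply` are all assumptions chosen for illustration. The point is only that the Gather loop reads feature rows at data-dependent addresses, while the Apply loop is a regular dense matrix product.

```python
def gather(indptr, indices, features):
    """Gather phase (assumed sum aggregation over a CSR adjacency).

    indptr[v]..indptr[v+1] delimits the neighbor list of vertex v
    inside `indices`. Which feature rows are read depends on the graph,
    so the memory access pattern is sparse and irregular.
    """
    num_vertices = len(indptr) - 1
    dim = len(features[0])
    out = [[0.0] * dim for _ in range(num_vertices)]
    for v in range(num_vertices):
        for e in range(indptr[v], indptr[v + 1]):
            u = indices[e]  # data-dependent neighbor index
            for k in range(dim):
                out[v][k] += features[u][k]
    return out


def apply(aggregated, weight):
    """Apply phase (assumed dense linear layer followed by ReLU).

    Every vertex performs the same fixed-size computation, so the
    access pattern is dense and regular.
    """
    out_dim = len(weight[0])
    result = []
    for row in aggregated:
        new_row = []
        for j in range(out_dim):
            s = sum(row[k] * weight[k][j] for k in range(len(row)))
            new_row.append(max(0.0, s))
        result.append(new_row)
    return result


# Hypothetical toy graph with 3 vertices:
# vertex 0 gathers from {1, 2}, vertex 1 from {0}, vertex 2 from {0, 1}.
indptr = [0, 2, 3, 5]
indices = [1, 2, 0, 0, 1]
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
weight = [[1.0, -1.0], [0.5, 1.0]]

aggregated = gather(indptr, indices, features)
hidden = apply(aggregated, weight)
print(aggregated)  # [[2.0, 3.0], [1.0, 0.0], [1.0, 1.0]]
print(hidden)      # [[3.5, 1.0], [1.0, 0.0], [1.5, 0.0]]
```

In this sketch the inner loop of `gather` touches `features[u]` for whatever `u` the graph dictates, which is the kind of irregular communication the abstract identifies as the bottleneck. The loop in `apply` has the same shape for every vertex.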