Title: Efficient Inference of Graph Neural Networks Using Local Sensitive Hash
Authors: Tao Liu; Peng Li; Zhou Su; Mianxiong Dong
Journal: IEEE Transactions on Sustainable Computing, vol. 9, no. 3, pp. 548-558
Publication date: 2024-01-09
DOI: 10.1109/TSUSC.2024.3351282
URL: https://ieeexplore.ieee.org/document/10384786/
Citation count: 0
Abstract
Graph neural networks (GNNs) have attracted significant research attention because of their impressive capability in handling graph-structured data, such as energy networks, which are crucial for sustainable computing. We find that loading data from main memory to GPUs is the main bottleneck of GNN inference, because much of the loaded data is redundant. In this paper, we propose RAIN, an efficient GNN inference system for graph learning, built on two key designs. First, we run similar inference batches back to back and reuse the nodes repeated across adjacent batches, which reduces redundant data loading. This requires reordering the batches by similarity, but comparing a large number of inference batches pair by pair is computationally expensive. We therefore propose a locality-sensitive hashing (LSH)-based clustering scheme that quickly groups similar batches without pairwise comparison. Second, RAIN includes an efficient adaptive sampling strategy that samples each target node's neighbors according to its degree: the number of sampled neighbors is proportional to the node's degree. We conduct extensive experiments against various baselines. RAIN achieves up to 6.8× speedup with an accuracy drop of less than 0.1%.
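
The two designs described in the abstract can be illustrated with a minimal Python sketch. It is not RAIN's implementation: the abstract does not say which LSH family is used, so MinHash over each batch's node-ID set is assumed here, and every name (`reorder_batches`, `nodes_to_load`, `sample_neighbors`, `ratio`, `max_fanout`) is hypothetical. Batches are assumed to be non-empty sets of integer node IDs smaller than the prime modulus.

```python
import math
import random
from collections import defaultdict

PRIME = 2_147_483_647  # 2**31 - 1; node IDs are assumed to be smaller than this


def minhash_signature(node_ids, seeds):
    """MinHash signature of a non-empty set of node IDs (assumed LSH family)."""
    return tuple(min((a * n + b) % PRIME for n in node_ids) for a, b in seeds)


def reorder_batches(batches, num_hashes=8, rows=2, seed=0):
    """Group batches by LSH bucket so similar batches become adjacent.

    Each batch is hashed once; no pairwise similarity is computed.
    The bucket key is the first `rows` MinHash values (a single LSH band).
    """
    rng = random.Random(seed)
    seeds = [(rng.randrange(1, PRIME), rng.randrange(PRIME))
             for _ in range(num_hashes)]
    buckets = defaultdict(list)
    for idx, batch in enumerate(batches):
        key = minhash_signature(batch, seeds)[:rows]
        buckets[key].append(idx)
    # Emit buckets in first-seen order; members of a bucket stay together.
    return [idx for members in buckets.values() for idx in members]


def nodes_to_load(order, batches):
    """Count nodes loaded when each batch reuses nodes of the previous one."""
    loaded, previous = 0, set()
    for idx in order:
        loaded += len(batches[idx] - previous)
        previous = batches[idx]
    return loaded


def sample_neighbors(neighbors, ratio=0.5, max_fanout=None, rng=random):
    """Sample a number of neighbors proportional to the node's degree."""
    neighbors = list(neighbors)
    if not neighbors:
        return []
    k = max(1, math.ceil(ratio * len(neighbors)))
    if max_fanout is not None:
        k = min(k, max_fanout)
    return rng.sample(neighbors, k)


batches = [{1, 2, 3}, {100, 200, 300}, {1, 2, 3}]
order = reorder_batches(batches)
print(order)                                   # identical batches 0 and 2 end up adjacent
print(nodes_to_load([0, 1, 2], batches))       # original order
print(nodes_to_load(order, batches))           # reordered
print(len(sample_neighbors(range(10))))        # degree 10, ratio 0.5
```

In this toy input the reordered schedule loads 6 nodes instead of 9, because the second copy of the repeated batch reuses everything loaded for the first. A practical scheme would likely use several bands and a cache larger than one batch; both are left out to keep the sketch short.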