Accelerating Graph Neural Networks with a Novel Matrix Compression Format

João N. F. Alves, Samir Moustafa, Siegfried Benkner, Alexandre P. Francisco, Wilfried N. Gansterer, Luís M. S. Russo

arXiv - CS - Data Structures and Algorithms, published 2024-09-03. DOI: arxiv-2409.02208 (https://doi.org/arxiv-2409.02208)
Cited by: 0
Abstract
The inference and training stages of Graph Neural Networks (GNNs) are often
dominated by the time required to compute a long sequence of matrix
multiplications between the sparse graph adjacency matrix and its embedding. To
accelerate these stages, we first propose the Compressed Binary Matrix (CBM)
storage format to succinctly represent the binary adjacency matrix of an
unweighted graph. Then, we show how to generalize this representation to the
normalized adjacency matrices of unweighted graphs, which arise in the context
of GNNs. Finally, we develop efficient matrix multiplication kernels based on
this compressed representation. The matrix multiplication kernels proposed in
this work never require more scalar operations than classic sparse matrix
multiplication algorithms. Experimental evaluation shows that the proposed
matrix multiplication strategies outperform the current state-of-the-art
implementations provided by Intel MKL, achieving speedups close to 5$\times$.
Furthermore, our optimized matrix multiplication strategies accelerated GNN
inference by up to $3\times$.
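The core idea behind a delta-based compression of a binary adjacency matrix can be sketched as follows. This is a hypothetical illustration, not the paper's CBM implementation: here each row simply references the row directly above it, whereas the actual format may choose reference rows far more cleverly to minimize total delta size. Each output row of the product is then obtained from the previous one by adding and subtracting a few rows of the dense embedding, so rows with similar neighborhoods cost only a handful of scalar operations.

```python
import numpy as np

def delta_compress(A):
    """Store each row of a binary matrix A as the set of column indices
    that must be added to / removed from the previous row.
    (Sketch only: a real CBM-style format would pick reference rows to
    minimize the total number of deltas, not just use row i-1.)"""
    deltas = []
    prev = np.zeros(A.shape[1], dtype=A.dtype)
    for row in A:
        add = np.flatnonzero((row == 1) & (prev == 0))  # new neighbors
        sub = np.flatnonzero((row == 0) & (prev == 1))  # dropped neighbors
        deltas.append((add, sub))
        prev = row
    return deltas

def delta_matmul(deltas, X):
    """Multiply the compressed binary matrix by a dense embedding X.
    Row i of the result is row i-1 plus the embedding rows indexed by
    `add` minus those indexed by `sub`."""
    out = np.zeros((len(deltas), X.shape[1]))
    acc = np.zeros(X.shape[1])
    for i, (add, sub) in enumerate(deltas):
        acc = acc + X[add].sum(axis=0) - X[sub].sum(axis=0)
        out[i] = acc
    return out
```

When consecutive rows share most of their nonzeros, the per-row work is proportional to the symmetric difference of the neighborhoods rather than to the full row density, which is the intuition behind the claim that such kernels never need more scalar operations than classic sparse multiplication.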