Proformer: a scalable graph transformer with linear complexity
Zhu Liu, Peng Wang, Cui Ni, Qingling Zhang
Applied Intelligence, vol. 55, no. 2 (published 2024-12-13)
DOI: 10.1007/s10489-024-06065-x
https://link.springer.com/article/10.1007/s10489-024-06065-x
Citations: 0
Abstract
Because existing GNN methods use a fixed input graph structure for message passing, they cannot resolve the problems of heterogeneity, over-squashing, long-range dependencies, and graph incompleteness. The all-pair message passing scheme is an effective means of addressing these issues. However, the quadratic complexity of the self-attention it relies on makes it impossible to guarantee both scalability and accuracy on large-scale graph datasets. In this paper, we propose Proformer, which uses multilayer dilated convolution to project the key and value in self-attention and applies a focused function to further enhance the model's representational power, reducing the computational complexity of the all-pair message passing scheme from quadratic to linear. Experimental results show that Proformer performs very well on node, image, and text tasks. Moreover, when scaled to large graph datasets, it effectively reduces inference time and GPU memory usage while preserving accuracy: on OGB-Proteins, it improves the ROC-AUC by 3.2% while saving 27.8% of GPU memory.
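The complexity reduction described above follows the general recipe of kernelized linear attention: replace softmax with a nonnegative feature map φ so that attention can be computed as φ(Q)(φ(K)ᵀV) in O(n·d²) instead of O(n²·d). Below is a minimal sketch of that idea with a simple focusing-style feature map; the names `focused_map` and its power parameter `p` are illustrative assumptions, and the paper's actual multilayer dilated-convolution projection of keys and values is not reproduced here.

```python
import numpy as np

def focused_map(x, p=3, eps=1e-6):
    # Focusing-style feature map (an assumption, not the paper's exact
    # "focused function"): sharpen ReLU features by raising them to a
    # power while preserving each row's norm.
    x = np.maximum(x, 0) + eps
    xp = x ** p
    return xp * (np.linalg.norm(x, axis=-1, keepdims=True) /
                 np.linalg.norm(xp, axis=-1, keepdims=True))

def linear_attention(Q, K, V, p=3):
    # All-pair attention in O(n * d^2) rather than O(n^2 * d):
    #   out = phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1)
    q, k = focused_map(Q, p), focused_map(K, p)
    kv = k.T @ V                # (d, d_v), computed once for all queries
    z = q @ k.sum(axis=0)       # per-query normalizer, shape (n,)
    return (q @ kv) / z[:, None]

# Tiny demo: 6 nodes with 4-dimensional features.
rng = np.random.default_rng(0)
n, d = 6, 4
Q, K, V = rng.standard_normal((3, n, d))
out = linear_attention(Q, K, V)
print(out.shape)  # (6, 4)
```

Because `kv` and the normalizer are shared across queries, memory and time grow linearly in the number of nodes, which is the property that lets an all-pair scheme scale to large graphs.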
About the journal
With a focus on research in artificial intelligence and neural networks, this journal addresses real-life manufacturing, defense, management, government, and industrial problems that are too complex to be solved through conventional approaches and that require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.