A Pure Transformer Pretraining Framework on Text-attributed Graphs

Yu Song, Haitao Mao, Jiachen Xiao, Jingzhe Liu, Zhikai Chen, Wei Jin, Carl Yang, Jiliang Tang, Hui Liu

Proceedings of Machine Learning Research, vol. 269, November 2024. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416796/pdf/
Abstract
Pretraining plays a pivotal role in acquiring generalized knowledge from large-scale data, achieving remarkable successes as evidenced by large models in CV and NLP. However, progress in the graph domain remains limited due to fundamental challenges, namely feature heterogeneity and structural heterogeneity. Recent efforts have addressed feature heterogeneity via Large Language Models (LLMs) on text-attributed graphs (TAGs) by generating fixed-length text representations as node features. These high-quality features reduce the previously critical role of graph structure, resulting in a modest performance gap between Graph Neural Networks (GNNs) and structure-agnostic Multi-Layer Perceptrons (MLPs). Motivated by this, we introduce a feature-centric pretraining perspective by treating graph structure as a prior and leveraging the rich, unified feature space to learn refined interaction patterns that generalize across graphs. Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts via random walks and employs masked feature reconstruction to capture pairwise proximity in the LLM-unified feature space using a standard Transformer. By utilizing unified text representations rather than varying structures, GSPT alleviates structural heterogeneity and achieves significantly better transferability among graphs within the same domain. Our approach can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets. The source code is publicly available at https://github.com/SongYYYY/GSPT.
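To make the pretraining recipe concrete, below is a minimal PyTorch sketch of the pipeline the abstract describes: sample a random-walk context around a seed node, mask a fraction of the LLM-derived node features, and train a standard Transformer encoder to reconstruct the masked features. This is an illustrative approximation, not the authors' released implementation (see the GitHub link above); the names random_walk_context, MaskedFeatureReconstructor, and pretrain_step, as well as all hyperparameters, are hypothetical placeholders.

import torch
import torch.nn as nn

def random_walk_context(adj_list, seed, walk_length):
    # Sample a fixed-length random walk starting from `seed` over an adjacency list.
    walk = [seed]
    for _ in range(walk_length - 1):
        neighbors = adj_list[walk[-1]]
        if not neighbors:
            walk.append(walk[-1])  # stay in place on isolated nodes
        else:
            walk.append(neighbors[torch.randint(len(neighbors), (1,)).item()])
    return walk

class MaskedFeatureReconstructor(nn.Module):
    # Standard Transformer encoder that reconstructs masked node features
    # (LLM-derived text embeddings) within a random-walk context sequence.
    def __init__(self, feat_dim, d_model=256, nhead=4, num_layers=4):
        super().__init__()
        self.in_proj = nn.Linear(feat_dim, d_model)
        self.mask_token = nn.Parameter(torch.zeros(d_model))  # learned [MASK] embedding
        encoder_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers)
        self.out_proj = nn.Linear(d_model, feat_dim)

    def forward(self, node_feats, mask):
        # node_feats: (batch, walk_length, feat_dim) text embeddings of walk nodes
        # mask:       (batch, walk_length) bool, True where features are hidden
        h = self.in_proj(node_feats)
        h[mask] = self.mask_token
        h = self.encoder(h)
        return self.out_proj(h)

def pretrain_step(model, optimizer, node_feats, mask_ratio=0.3):
    # One masked-feature-reconstruction step; loss is computed only on masked positions.
    mask = torch.rand(node_feats.shape[:2]) < mask_ratio
    pred = model(node_feats, mask)
    loss = nn.functional.mse_loss(pred[mask], node_feats[mask])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Because the Transformer only ever sees sequences of unified text embeddings, the same pretrained weights can in principle be reused on any graph whose nodes are embedded by the same LLM encoder, which is the transferability argument made in the abstract.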