Towards Faster Graph Partitioning via Pre-training and Inductive Inference

Meng Qin, Chaorui Zhang, Yu Gao, Yibin Ding, Weipeng Jiang, Weixi Zhang, Wei Han, Bo Bai
arXiv:2409.00670 · arXiv - CS - Social and Information Networks · Published 2024-09-01 · Citations: 0

Abstract

Graph partitioning (GP) is a classic problem that divides the node set of a graph into densely-connected blocks. Following the IEEE HPEC Graph Challenge and recent advances in pre-training techniques (e.g., large language models), we propose PR-GPT (Pre-trained & Refined Graph ParTitioning) based on a novel pre-training & refinement paradigm. We first conduct the offline pre-training of a deep graph learning (DGL) model on small synthetic graphs with various topology properties. By using the inductive inference of DGL, one can directly generalize the pre-trained model (with frozen model parameters) to large graphs and derive feasible GP results. We also use the derived partition as a good initialization of an efficient GP method (e.g., InfoMap) to further refine the quality of partitioning. In this setting, the online generalization and refinement of PR-GPT can not only benefit from the transfer ability regarding quality but also ensure high inference efficiency without re-training. Based on a mechanism of reducing the scale of the graph to be processed by the refinement method, PR-GPT also has the potential to support streaming GP. Experiments on the Graph Challenge benchmark demonstrate that PR-GPT can ensure faster GP on large-scale graphs without significant quality degradation, compared with running a refinement method from scratch. We will make our code public at https://github.com/KuroginQin/PRGPT.
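The pre-training & refinement paradigm described in the abstract can be illustrated with a minimal sketch. This is not the paper's actual pipeline: the frozen pre-trained DGL model is replaced here by a cheap label-propagation stand-in that produces an initial partition, and the refinement step is a stand-in modularity clustering run on the block-level quotient graph. The point it illustrates is the scale-reduction mechanism: given a reasonable initialization, the refiner only needs to process a graph of super-nodes (one per block) rather than the full graph.

```python
# Minimal sketch of the pre-train -> infer -> refine paradigm (stand-in
# components only; see https://github.com/KuroginQin/PRGPT for the real code).
import networkx as nx

def initial_partition(G):
    """Stand-in for inductive inference with a frozen pre-trained model."""
    return [set(c) for c in nx.community.asyn_lpa_communities(G, seed=0)]

def refine(G, blocks):
    """Refine an initial partition by clustering the block-level quotient graph."""
    # Contract each block to a super-node; by default the quotient graph's
    # edge 'weight' counts the number of edges joining two blocks.
    Q = nx.quotient_graph(G, blocks)
    refined = nx.community.greedy_modularity_communities(Q, weight="weight")
    # Project super-node communities back to sets of original nodes.
    return [set().union(*block_group) for block_group in refined]

G = nx.karate_club_graph()
blocks = initial_partition(G)      # cheap initialization on the full graph
partition = refine(G, blocks)      # refinement runs on the reduced graph
```

Because the refiner only ever sees the quotient graph, the same mechanism lends itself to streaming settings, where newly arrived nodes can be merged into existing blocks before refinement.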