Towards Faster Graph Partitioning via Pre-training and Inductive Inference

Meng Qin, Chaorui Zhang, Yu Gao, Yibin Ding, Weipeng Jiang, Weixi Zhang, Wei Han, Bo Bai
arXiv:2409.00670 · arXiv - CS - Social and Information Networks · Published 2024-09-01 · Citations: 0

Abstract

Graph partitioning (GP) is a classic problem that divides the node set of a graph into densely-connected blocks. Following the IEEE HPEC Graph Challenge and recent advances in pre-training techniques (e.g., large language models), we propose PR-GPT (Pre-trained & Refined Graph ParTitioning) based on a novel pre-training & refinement paradigm. We first conduct the offline pre-training of a deep graph learning (DGL) model on small synthetic graphs with various topology properties. By using the inductive inference of DGL, one can directly generalize the pre-trained model (with frozen model parameters) to large graphs and derive feasible GP results. We also use the derived partition as a good initialization of an efficient GP method (e.g., InfoMap) to further refine the quality of partitioning. In this setting, the online generalization and refinement of PR-GPT can not only benefit from the transfer ability regarding quality but also ensure high inference efficiency without re-training. Based on a mechanism of reducing the scale of the graph to be processed by the refinement method, PR-GPT also has the potential to support streaming GP. Experiments on the Graph Challenge benchmark demonstrate that PR-GPT can ensure faster GP on large-scale graphs without significant quality degradation, compared with running a refinement method from scratch. We will make our code public at https://github.com/KuroginQin/PRGPT.
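The pre-training & refinement paradigm described in the abstract can be illustrated with a minimal sketch. This is not the paper's actual pipeline: the frozen pre-trained DGL model is replaced here by a cheap label-propagation stand-in that produces an initial partition, and the refinement step is a stand-in modularity clustering run on the block-level quotient graph. The point it illustrates is the scale-reduction mechanism: given a reasonable initialization, the refiner only needs to process a graph of super-nodes (one per block) rather than the full graph.

```python
# Minimal sketch of the pre-train -> infer -> refine paradigm (stand-in
# components only; see https://github.com/KuroginQin/PRGPT for the real code).
import networkx as nx

def initial_partition(G):
    """Stand-in for inductive inference with a frozen pre-trained model."""
    return [set(c) for c in nx.community.asyn_lpa_communities(G, seed=0)]

def refine(G, blocks):
    """Refine an initial partition by clustering the block-level quotient graph."""
    # Contract each block to a super-node; by default the quotient graph's
    # edge 'weight' counts the number of edges joining two blocks.
    Q = nx.quotient_graph(G, blocks)
    refined = nx.community.greedy_modularity_communities(Q, weight="weight")
    # Project super-node communities back to sets of original nodes.
    return [set().union(*block_group) for block_group in refined]

G = nx.karate_club_graph()
blocks = initial_partition(G)      # cheap initialization on the full graph
partition = refine(G, blocks)      # refinement runs on the reduced graph
```

Because the refiner only ever sees the quotient graph, the same mechanism lends itself to streaming settings, where newly arrived nodes can be merged into existing blocks before refinement.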