学习提取图神经网络

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI:10.1145/3539597.3570480

Cheng Yang, Yuxin Guo, Y. Xu, Chuan Shi, Jiawei Liu, Chuncheng Wang, Xin Li, Ning Guo, Hongzhi Yin

{"title":"学习提取图神经网络","authors":"Cheng Yang, Yuxin Guo, Y. Xu, Chuan Shi, Jiawei Liu, Chuncheng Wang, Xin Li, Ning Guo, Hongzhi Yin","doi":"10.1145/3539597.3570480","DOIUrl":null,"url":null,"abstract":"Graph Neural Networks (GNNs) can effectively capture both the topology and attribute information of a graph, and have been extensively studied in many domains. Recently, there is an emerging trend that equips GNNs with knowledge distillation for better efficiency or effectiveness. However, to the best of our knowledge, existing knowledge distillation methods applied on GNNs all employed predefined distillation processes, which are controlled by several hyper-parameters without any supervision from the performance of distilled models. Such isolation between distillation and evaluation would lead to suboptimal results. In this work, we aim to propose a general knowledge distillation framework that can be applied on any pretrained GNN models to further improve their performance. To address the isolation problem, we propose to parameterize and learn distillation processes suitable for distilling GNNs. Specifically, instead of introducing a unified temperature hyper-parameter as most previous work did, we will learn node-specific distillation temperatures towards better performance of distilled models. We first parameterize each node's temperature by a function of its neighborhood's encodings and predictions, and then design a novel iterative learning process for model distilling and temperature learning. We also introduce a scalable variant of our method to accelerate model training. Experimental results on five benchmark datasets show that our proposed framework can be applied on five popular GNN models and consistently improve their prediction accuracies with 3.12% relative enhancement on average. Besides, the scalable variant enables 8 times faster training speed at the cost of 1% prediction accuracy.","PeriodicalId":227804,"journal":{"name":"Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining","volume":"74 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Learning to Distill Graph Neural Networks\",\"authors\":\"Cheng Yang, Yuxin Guo, Y. Xu, Chuan Shi, Jiawei Liu, Chuncheng Wang, Xin Li, Ning Guo, Hongzhi Yin\",\"doi\":\"10.1145/3539597.3570480\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph Neural Networks (GNNs) can effectively capture both the topology and attribute information of a graph, and have been extensively studied in many domains. Recently, there is an emerging trend that equips GNNs with knowledge distillation for better efficiency or effectiveness. However, to the best of our knowledge, existing knowledge distillation methods applied on GNNs all employed predefined distillation processes, which are controlled by several hyper-parameters without any supervision from the performance of distilled models. Such isolation between distillation and evaluation would lead to suboptimal results. In this work, we aim to propose a general knowledge distillation framework that can be applied on any pretrained GNN models to further improve their performance. To address the isolation problem, we propose to parameterize and learn distillation processes suitable for distilling GNNs. Specifically, instead of introducing a unified temperature hyper-parameter as most previous work did, we will learn node-specific distillation temperatures towards better performance of distilled models. We first parameterize each node's temperature by a function of its neighborhood's encodings and predictions, and then design a novel iterative learning process for model distilling and temperature learning. We also introduce a scalable variant of our method to accelerate model training. Experimental results on five benchmark datasets show that our proposed framework can be applied on five popular GNN models and consistently improve their prediction accuracies with 3.12% relative enhancement on average. Besides, the scalable variant enables 8 times faster training speed at the cost of 1% prediction accuracy.\",\"PeriodicalId\":227804,\"journal\":{\"name\":\"Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining\",\"volume\":\"74 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-02-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3539597.3570480\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3539597.3570480","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

图神经网络(Graph Neural Networks, gnn)能够有效地捕获图的拓扑和属性信息，在许多领域得到了广泛的研究。最近出现了一种新的趋势，即为gnn配备知识蒸馏以提高效率或有效性。然而，据我们所知，用于gnn的现有知识蒸馏方法都采用预定义的蒸馏过程，该过程由几个超参数控制，而不受蒸馏模型性能的监督。蒸馏和评价之间的这种隔离将导致次优结果。在这项工作中，我们的目标是提出一个通用的知识蒸馏框架，可以应用于任何预训练的GNN模型，以进一步提高它们的性能。为了解决分离问题，我们提出参数化和学习适合于gnn蒸馏的蒸馏过程。具体来说，我们将学习节点特定的蒸馏温度，而不是像大多数以前的工作那样引入统一的温度超参数，以提高蒸馏模型的性能。我们首先通过邻域的编码和预测函数来参数化每个节点的温度，然后设计了一种新的迭代学习过程来进行模型提取和温度学习。我们还引入了一种可扩展的方法来加速模型训练。在5个基准数据集上的实验结果表明，我们提出的框架可以应用于5种常用的GNN模型，并持续提高其预测精度，平均相对提高3.12%。此外，可扩展的变体以1%的预测精度为代价，使训练速度提高8倍。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Learning to Distill Graph Neural Networks

Graph Neural Networks (GNNs) can effectively capture both the topology and attribute information of a graph, and have been extensively studied in many domains. Recently, there is an emerging trend that equips GNNs with knowledge distillation for better efficiency or effectiveness. However, to the best of our knowledge, existing knowledge distillation methods applied on GNNs all employed predefined distillation processes, which are controlled by several hyper-parameters without any supervision from the performance of distilled models. Such isolation between distillation and evaluation would lead to suboptimal results. In this work, we aim to propose a general knowledge distillation framework that can be applied on any pretrained GNN models to further improve their performance. To address the isolation problem, we propose to parameterize and learn distillation processes suitable for distilling GNNs. Specifically, instead of introducing a unified temperature hyper-parameter as most previous work did, we will learn node-specific distillation temperatures towards better performance of distilled models. We first parameterize each node's temperature by a function of its neighborhood's encodings and predictions, and then design a novel iterative learning process for model distilling and temperature learning. We also introduce a scalable variant of our method to accelerate model training. Experimental results on five benchmark datasets show that our proposed framework can be applied on five popular GNN models and consistently improve their prediction accuracies with 3.12% relative enhancement on average. Besides, the scalable variant enables 8 times faster training speed at the cost of 1% prediction accuracy.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining

自引率

0.00%

发文量