End-to-End Learning of Graph Similarity
Zhixin Chen, Mengxiang Lin, Deqing Wang
2019 International Conference on High Performance Computing & Simulation (HPCS), 2019-07-01
DOI: 10.1109/HPCS48598.2019.9188094
https://doi.org/10.1109/HPCS48598.2019.9188094
Citations: 0
Abstract
Computing graph comparison metrics exactly can be expensive because of their prohibitively high time complexity, which is exponential in some cases. A learned model that approximates these metrics is therefore desirable. In this paper, we cast the computation of graph similarity/distance as a learning problem and propose an end-to-end model based on a GCN (Graph Convolutional Network) to compute the GFD (Graphlet Frequency Distribution) distance between graphs. The trained model predicts the GFD distance directly, rather than constructing a GFD vector by counting graphlets as traditional methods do. An experimental evaluation validates the effectiveness of our model on real-world networks ranging from tens to thousands of nodes. On average, our trained model takes $480\times$ less time than the count-based method on the dataset. The top-3 nearest accuracy reaches 74.6% and the top-5 nearest accuracy reaches 85.2% on the test data.
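To make the count-based baseline mentioned in the abstract concrete, the following is a minimal sketch of a graphlet-counting distance. It rests on several assumptions that the abstract does not confirm: only the two connected 3-node graphlets (open path and triangle) are counted, the GFD vector is the normalized frequency of those counts, and the distance is the L1 difference between two such vectors. The paper's actual graphlet sizes and distance formula may differ, and all function names here are hypothetical.

```python
from itertools import combinations


def from_edges(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_3node_graphlets(adj):
    """Count induced connected 3-node subgraphs: (open_paths, triangles).

    Assumption: only 3-node graphlets are considered, purely for illustration.
    """
    open_paths = 0
    triangle_corners = 0
    for v, nbrs in adj.items():
        # Every pair of neighbours of v forms either a triangle or an open path
        # centred on v.
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                triangle_corners += 1  # each triangle is seen once per corner
            else:
                open_paths += 1
    return open_paths, triangle_corners // 3


def gfd_vector(adj):
    """Normalized graphlet frequencies (assumed form of the GFD vector)."""
    counts = count_3node_graphlets(adj)
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def gfd_distance(adj_g, adj_h):
    """L1 distance between two GFD vectors (assumed distance, not the paper's)."""
    return sum(abs(x - y) for x, y in zip(gfd_vector(adj_g), gfd_vector(adj_h)))


triangle = from_edges([(0, 1), (1, 2), (0, 2)])
path = from_edges([(0, 1), (1, 2)])
print(gfd_distance(triangle, path))
```

Even in this reduced form the counting step visits every pair of neighbours of every node, and the cost grows quickly as larger graphlets are included, which is the expense the learned model is meant to avoid.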