Multiview learning of homogeneous neighborhood of nodes for the node representation of heterogeneous graph

IF 3.4 · Zone 2 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Intelligence · Pub Date: 2023-08-05 · DOI: 10.1007/s10489-023-04907-8

Dongjie Li, Dong Li, Hao Liu

Applied Intelligence, volume 53, issue 21, pages 25184-25200. Journal article. https://link.springer.com/article/10.1007/s10489-023-04907-8

Citations: 0

Abstract

Multiview learning has attracted the interest of many graph researchers because it can learn richer information about graphs from different views. As a novel learning paradigm, it has recently been widely applied to learning node representations of heterogeneous graphs, for example in MVSE and HeMI. However, these methods use only the local homogeneous neighborhood information of nodes, which degrades the quality of the node representations. Heterogeneous graph representation aims to drive the representation of a node to be near the homogeneous neighbors that are similar to it in the heterogeneous graph and far away from heterogeneous neighbors. Moreover, in a heterogeneous graph, linked nodes are more likely to be dissimilar, whereas remote nodes may share some similarities. We can therefore look beyond the locality of a node to discover information from more homogeneous neighbors and improve the quality of node representation. In this work, we propose an unsupervised heterogeneous graph embedding technique that is simple yet efficient, and devise a systematic way to learn node embeddings from the local and global views of the homogeneous neighborhood of nodes by introducing a regularization framework that minimizes the disagreement between the local and global node embeddings under a specific meta-path. Inspired by Personalized PageRank graph diffusion, we extend a meta-path-based random walk with restart to an infinite number of steps to obtain the global homogeneous neighbors of nodes, and construct a meta-path-based diffusion matrix to represent the relation between global homogeneous neighbors and nodes. Finally, we train our model with mini-batch gradient descent to reduce computational cost. Experimental results show that our approach outperforms a wide variety of baselines on different datasets in node classification and node clustering tasks, with a 7.22% improvement over the best baseline on the ACM dataset.
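
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a paper-author-paper meta-path, a row-normalized transition matrix with self-loops, the standard closed-form Personalized PageRank diffusion S = alpha * (I - (1 - alpha) * T)^-1, and a squared-distance disagreement term. The function names, the toy incidence matrix, and the restart probability alpha = 0.15 are all illustrative assumptions.

```python
import numpy as np


def metapath_adjacency(incidence: np.ndarray) -> np.ndarray:
    """Homogeneous adjacency under a two-hop meta-path (e.g. paper-author-paper).

    incidence[i, j] = 1 if paper i is linked to author j. Two papers become
    local homogeneous neighbors when they share at least one author.
    """
    shared = incidence @ incidence.T
    adj = (shared > 0).astype(float)
    np.fill_diagonal(adj, 0.0)  # drop trivial self-connections
    return adj


def ppr_diffusion(adj: np.ndarray, alpha: float = 0.15) -> np.ndarray:
    """Closed-form Personalized PageRank diffusion (assumed form, see lead-in).

    Row i is the stationary distribution of a random walk that starts at
    node i and restarts there with probability alpha at every step.
    """
    n = adj.shape[0]
    a_hat = adj + np.eye(n)  # self-loops keep every row sum non-zero
    t = a_hat / a_hat.sum(axis=1, keepdims=True)  # row-stochastic transitions
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * t)


def disagreement(z_local: np.ndarray, z_global: np.ndarray) -> float:
    """Mean squared distance between local-view and global-view embeddings."""
    return float(np.mean(np.sum((z_local - z_global) ** 2, axis=1)))


# Toy graph (hypothetical): 4 papers, 3 authors.
a_pa = np.array([
    [1, 0, 0],  # paper 0: author 0
    [1, 1, 0],  # paper 1: authors 0 and 1
    [0, 1, 0],  # paper 2: author 1
    [0, 0, 1],  # paper 3: author 2 (shares no author with the others)
])
adj = metapath_adjacency(a_pa)  # local view
s = ppr_diffusion(adj)          # global view
```

In this toy graph, papers 0 and 2 share no author and so are not local neighbors, yet the diffusion matrix assigns them a positive weight because both connect to paper 1. Paper 3 stays disconnected in both views. A regularizer in the spirit of the abstract would then penalize `disagreement` between embeddings computed from `adj` and from `s`.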

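Source journal

Applied Intelligence (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 6.60
Self-citation rate: 20.80%
Articles published: 1361
Review time: 5.9 months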

About the journal: With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance. The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.