Clustering via Meta-path Embedding for Heterogeneous Information Networks

Yongjun Zhang, Xiaoping Yang, Liang Wang
Published in: 2020 IEEE International Conference on Knowledge Graph (ICKG), 2020-08-01
DOI: 10.1109/ICBK50248.2020.00036 (https://doi.org/10.1109/ICBK50248.2020.00036)
Citations: 1

Abstract

A low-dimensional embedding of nodes is convenient for clustering, one of the most fundamental tasks on heterogeneous information networks (HINs). However, random walk-based network embedding has been proved equivalent to matrix factorization, which is computationally expensive. Moreover, mapping different types of nodes into one metric space may result in incompatibility. To cope with these two challenges, this paper proposes a meta-path embedding based clustering method called MPEClus. First, the original network is transformed into several subnetworks with independent semantics specified by meta-paths, which addresses the incompatibility problem. Second, an approximate commute embedding method, which bypasses eigen-decomposition to reduce computational cost, is used to learn representations of the nodes in each subnetwork. Finally, a unified probabilistic generative model aggregates the vectorized representations learned in the different metric spaces for clustering. Experimental results show that MPEClus is effective for HIN clustering and outperforms state-of-the-art baselines on two real-world datasets.
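The first step, turning the HIN into meta-path-specific subnetworks, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common convention that a meta-path subnetwork is obtained by multiplying the relation (bi-adjacency) matrices along the path. The function name `metapath_adjacency` and the toy author-paper data are hypothetical, and the commute embedding and probabilistic model stages are not shown.

```python
import numpy as np

def metapath_adjacency(*relations):
    """Compose relation matrices along a meta-path into one weighted,
    homogeneous adjacency matrix (assumed convention, not from the paper)."""
    product = relations[0]
    for rel in relations[1:]:
        product = product @ rel
    adj = product.astype(float)      # astype returns a copy
    np.fill_diagonal(adj, 0.0)       # drop self-loops
    return adj

# Hypothetical toy HIN: 3 authors (rows) x 2 papers (columns).
author_paper = np.array([[1, 0],
                         [1, 1],
                         [0, 1]])

# Author-Paper-Author meta-path: entry (i, j) counts the papers
# co-authored by author i and author j.
apa = metapath_adjacency(author_paper, author_paper.T)
print(apa)
```

Each meta-path produces its own subnetwork over a single node type, so each can be embedded in a separate metric space before the representations are aggregated.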