Fast RankClus Algorithm via Dynamic Rank Score Tracking on Bi-type Information Networks

Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services. Pub Date: 2019-12-02. DOI: 10.1145/3366030.3366051

Kotaro Yamazaki, Shohei Matsugu, Hiroaki Shiokawa, H. Kitagawa



Abstract

Given a bi-type information network, an extension of the well-known bipartite graph model, how can clusters be found efficiently? Graph clustering is now a fundamental tool for obtaining an overview of graph-structured data. The RankClus framework clusters bi-type information networks accurately using ranking-based graph clustering techniques. It integrates a graph ranking algorithm, such as PageRank or HITS, into the graph clustering procedure to improve clustering quality. However, this integration incurs a high computational cost on large bi-type information networks, because RankClus repeatedly runs the ranking algorithm over all nodes and edges until the clustering procedure converges. To overcome this runtime limitation, we present a novel RankClus algorithm that reduces the running time on large bi-type information networks. Our proposed method applies dynamic graph processing techniques to the ranking procedures in RankClus. By dynamically updating the ranking results, it reduces the number of nodes and edges computed during the repeated ranking procedures. Experiments on real-world datasets verify that the proposed method reduces the running time while maintaining the clustering quality of RankClus.
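The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea of updating ranking results instead of recomputing them. It is not the authors' algorithm. It assumes a degree-based ranking in which an attribute node's score within a cluster is its share of the edge weight coming from that cluster's target nodes, and every name in it (`full_rank`, `move_node`, `normalize`, the toy node labels) is hypothetical. When a target node changes cluster, only that node's edges are touched, whereas a full recomputation visits every edge.

```python
from collections import defaultdict


def full_rank(edges, assign, k):
    """Recompute per-cluster attribute weight sums from scratch (visits every edge)."""
    sums = [defaultdict(float) for _ in range(k)]
    for x, nbrs in edges.items():
        for y, w in nbrs.items():
            sums[assign[x]][y] += w
    return sums


def move_node(edges, assign, sums, x, new_cluster):
    """Move target node x to new_cluster and patch the sums in place.

    Only the edges of x are visited. Returns the number of edges touched.
    """
    old = assign[x]
    if old == new_cluster:
        return 0
    for y, w in edges[x].items():
        sums[old][y] -= w
        sums[new_cluster][y] += w
    assign[x] = new_cluster
    return len(edges[x])


def normalize(cluster_sums):
    """Turn weight sums into rank scores that add up to 1 within a cluster."""
    total = sum(cluster_sums.values())
    if total <= 0:
        return {}
    return {y: w / total for y, w in cluster_sums.items() if w > 0}


if __name__ == "__main__":
    # Hypothetical toy network: target nodes v1..v3, attribute nodes a..c.
    edges = {"v1": {"a": 2, "b": 1}, "v2": {"b": 3, "c": 1}, "v3": {"c": 4}}
    assign = {"v1": 0, "v2": 0, "v3": 1}
    sums = full_rank(edges, assign, 2)
    touched = move_node(edges, assign, sums, "v2", 1)
    print(touched, normalize(sums[0]), normalize(sums[1]))
```

In this sketch the cost of an update is proportional to the degree of the moved nodes, so the saving is largest in late iterations when few nodes still change cluster. Integer edge weights keep the subtraction exact; with arbitrary floating-point weights, accumulated rounding error would need an occasional full recomputation.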
