Fast Clustering of Hypergraphs Based on Bipartite-Edge Restoration and Node Reachability

Shuta Ito, Takayasu Fushimi
Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services, 2020-11-30
DOI: https://doi.org/10.1145/3428757.3429095
Citations: 3

Abstract

Hypergraphs generalize graphs by letting a single hyperedge relate two or more nodes. Although they have been studied actively in recent years, clustering methods for them are not yet well established. We propose a fast clustering method that expands a hypergraph into a bipartite graph by treating hyperedges as nodes, defines the relationship between each node and hyperedge by TF-IDF, and uses that value as the weight of the corresponding bipartite edge. The algorithm restores bipartite edges in descending order of TF-IDF weight and merges nodes that become reachable along the restored edges into clusters. Combining bipartite-edge restoration with node reachability yields efficient and effective clustering of large-scale hypergraphs. Furthermore, applying the same method to hyperedges produces a hard partition of the hyperedges, from which a soft partition of the nodes can be derived. Experimental evaluations on small-scale artificial and large-scale real datasets show that our method outputs more accurate clusters, as measured by modularity and by the F1 score between estimated and actual clusters. Its execution time is also significantly shorter than that of the compared existing method. For soft clustering, our method produces clusters of more balanced sizes.
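The restoration procedure described in the abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the abstract does not give the TF-IDF formula or the stopping rule, so the weight `tf = 1/|e|`, `idf = log(M / deg(v)) + 1` and the target cluster count `num_clusters` are assumptions, and the names `tfidf_edge_weights`, `find`, and `cluster_hypergraph` are hypothetical. Reachability is tracked with a union-find structure over the original nodes only.

```python
import math
from collections import defaultdict


def tfidf_edge_weights(hyperedges):
    """Return (weight, node, hyperedge_index) for every bipartite edge.

    ASSUMPTION: tf = 1/|e| and idf = log(M / deg(v)) + 1, where M is the
    number of hyperedges; the abstract does not state the exact formula.
    """
    num_hyperedges = len(hyperedges)
    degree = defaultdict(int)
    for members in hyperedges:
        for v in dict.fromkeys(members):  # de-duplicate, keep order
            degree[v] += 1
    weighted = []
    for j, members in enumerate(hyperedges):
        unique = list(dict.fromkeys(members))
        for v in unique:
            tf = 1.0 / len(unique)
            idf = math.log(num_hyperedges / degree[v]) + 1.0
            weighted.append((tf * idf, v, j))
    return weighted


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def cluster_hypergraph(hyperedges, num_clusters):
    """Restore bipartite edges by descending weight until num_clusters remain.

    ASSUMPTION: stopping at a target cluster count is this sketch's choice;
    the abstract does not say when restoration ends.
    """
    weighted = tfidf_edge_weights(hyperedges)
    parent = {v: v for _, v, _ in weighted}
    remaining = len(parent)  # every node starts as its own cluster
    anchor = {}  # hyperedge index -> first node attached to it
    for _, v, j in sorted(weighted, key=lambda t: -t[0]):
        if remaining <= num_clusters:
            break
        if j not in anchor:
            anchor[j] = v  # first restored edge of this hyperedge: no merge yet
            continue
        root_v, root_u = find(parent, v), find(parent, anchor[j])
        if root_v != root_u:  # v now reaches anchor[j] through hyperedge j
            parent[root_v] = root_u
            remaining -= 1
    groups = defaultdict(list)
    for v in parent:
        groups[find(parent, v)].append(v)
    return sorted(sorted(g) for g in groups.values())
```

For example, with the hyperedges `[["a", "b"], ["a", "b", "c"], ["d", "e"], ["d", "e", "f"], ["c", "f", "a", "d"]]` and `num_clusters=2`, the large bridging hyperedge receives the lowest weights under the assumed formula, so restoration stops before it connects the two groups and the call returns `[["a", "b", "c"], ["d", "e", "f"]]`. Sorting the edges dominates the cost, which is one plausible reason such a scheme scales to large hypergraphs.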