SDCG: Silhouette-based Deep Clustering with GNN for Improved Graph Node Clustering

Hyesoo Shin, Eunjo Jang, Sojeong Kim, Ki Yong Lee
Published in: 2023 IEEE/ACIS 21st International Conference on Software Engineering Research, Management and Applications (SERA), 2023-05-23
DOI: 10.1109/SERA57763.2023.10197683 (https://doi.org/10.1109/SERA57763.2023.10197683)

Abstract

Graph Neural Networks (GNNs) are powerful tools for analyzing graph-structured data in many fields because of their strong expressive power. They use a message-passing mechanism to update node embeddings, which are then used for tasks such as node classification and link prediction. Recently, node embeddings have also been used for graph node clustering, which aims to group similar nodes based on their features and the graph topology. However, conventional node clustering methods are limited in that the GNN focuses only on generating node embeddings, without considering the ultimate objective of clustering. To address this issue, a technique called "Deep Clustering" has been proposed, which integrates the node embedding and clustering stages. This requires a new loss function that combines the GNN loss and the clustering loss so that both are minimized simultaneously. In this paper, we propose Silhouette-based Deep Clustering with GNN (SDCG), which clusters the nodes of a graph more effectively by iteratively training the embedding model to produce embedding vectors that yield better clustering results. By applying the Silhouette coefficient, our proposed loss function incorporates not only the distance within clusters but also the distance between clusters, which leads to better clustering results. Through extensive experiments, we demonstrate that SDCG outperforms the conventional approach of performing embedding and clustering independently.
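
The abstract does not give the exact form of the loss, so the following is only a minimal sketch of how a Silhouette-based clustering term could be combined with an embedding loss. It uses the standard Silhouette definition s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from node i to the other nodes in its cluster and b(i) is the smallest mean distance from node i to the nodes of any other cluster. The function names, the Euclidean distance, the `1 - mean silhouette` penalty, and the `weight` parameter are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def silhouette_scores(embeddings, labels):
    """Per-node Silhouette coefficient s(i) = (b(i) - a(i)) / max(a(i), b(i))."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    if clusters.size < 2:
        raise ValueError("the Silhouette coefficient needs at least two clusters")

    # Pairwise Euclidean distances between all embedding vectors (assumed metric).
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))

    scores = np.zeros(embeddings.shape[0])
    for i in range(embeddings.shape[0]):
        same = labels == labels[i]
        same[i] = False  # exclude the node itself from its own cluster
        if not same.any():
            continue  # singleton cluster: s(i) is conventionally 0
        a = dist[i, same].mean()  # distance within the cluster
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = 0.0 if denom == 0 else (b - a) / denom
    return scores


def combined_loss(embedding_loss, embeddings, labels, weight=1.0):
    """Hypothetical joint objective: embedding loss plus a Silhouette penalty.

    The penalty 1 - mean(s) is small when clusters are compact and well
    separated; this particular combination is an assumption, not the paper's.
    """
    clustering_loss = 1.0 - silhouette_scores(embeddings, labels).mean()
    return embedding_loss + weight * clustering_loss
```

This NumPy version only evaluates the objective. To train a GNN with it, the same computation would have to be written in an automatic-differentiation framework, with cluster assignments refreshed between training iterations.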