ProtoMGAE：用于图形表征学习的原型感知掩码图形自动编码器

IF 4 3区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-02-20 DOI:10.1145/3649143

Yimei Zheng, Caiyan Jia

{"title":"ProtoMGAE：用于图形表征学习的原型感知掩码图形自动编码器","authors":"Yimei Zheng, Caiyan Jia","doi":"10.1145/3649143","DOIUrl":null,"url":null,"abstract":"<p>Graph self-supervised representation learning has gained considerable attention and demonstrated remarkable efficacy in extracting meaningful representations from graphs, particularly in the absence of labeled data. Two representative methods in this domain are graph auto-encoding and graph contrastive learning. However, the former methods primarily focus on global structures, potentially overlooking some fine-grained information during reconstruction. The latter methods emphasize node similarity across correlated views in the embedding space, potentially neglecting the inherent global graph information in the original input space. Moreover, handling incomplete graphs in real-world scenarios, where original features are unavailable for certain nodes, poses challenges for both types of methods. To alleviate these limitations, we integrate masked graph auto-encoding and prototype-aware graph contrastive learning into a unified model to learn node representations in graphs. In our method, we begin by masking a portion of node features and utilize a specific decoding strategy to reconstruct the masked information. This process facilitates the recovery of graphs from a global or macro level and enables handling incomplete graphs easily. Moreover, we treat the masked graph and the original one as a pair of contrasting views, enforcing the alignment and uniformity between their corresponding node representations at a local or micro level. Lastly, to capture cluster structures from a meso level and learn more discriminative representations, we introduce a prototype-aware clustering consistency loss that is jointly optimized with the above two complementary objectives. Extensive experiments conducted on several datasets demonstrate that the proposed method achieves significantly better or competitive performance on downstream tasks, especially for graph clustering, compared with the state-of-the-art methods, showcasing its superiority in enhancing graph representation learning.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"48 1","pages":""},"PeriodicalIF":4.0000,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ProtoMGAE: Prototype-aware Masked Graph Auto-Encoder for Graph Representation Learning\",\"authors\":\"Yimei Zheng, Caiyan Jia\",\"doi\":\"10.1145/3649143\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Graph self-supervised representation learning has gained considerable attention and demonstrated remarkable efficacy in extracting meaningful representations from graphs, particularly in the absence of labeled data. Two representative methods in this domain are graph auto-encoding and graph contrastive learning. However, the former methods primarily focus on global structures, potentially overlooking some fine-grained information during reconstruction. The latter methods emphasize node similarity across correlated views in the embedding space, potentially neglecting the inherent global graph information in the original input space. Moreover, handling incomplete graphs in real-world scenarios, where original features are unavailable for certain nodes, poses challenges for both types of methods. To alleviate these limitations, we integrate masked graph auto-encoding and prototype-aware graph contrastive learning into a unified model to learn node representations in graphs. In our method, we begin by masking a portion of node features and utilize a specific decoding strategy to reconstruct the masked information. This process facilitates the recovery of graphs from a global or macro level and enables handling incomplete graphs easily. Moreover, we treat the masked graph and the original one as a pair of contrasting views, enforcing the alignment and uniformity between their corresponding node representations at a local or micro level. Lastly, to capture cluster structures from a meso level and learn more discriminative representations, we introduce a prototype-aware clustering consistency loss that is jointly optimized with the above two complementary objectives. Extensive experiments conducted on several datasets demonstrate that the proposed method achieves significantly better or competitive performance on downstream tasks, especially for graph clustering, compared with the state-of-the-art methods, showcasing its superiority in enhancing graph representation learning.</p>\",\"PeriodicalId\":49249,\"journal\":{\"name\":\"ACM Transactions on Knowledge Discovery from Data\",\"volume\":\"48 1\",\"pages\":\"\"},\"PeriodicalIF\":4.0000,\"publicationDate\":\"2024-02-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Knowledge Discovery from Data\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3649143\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Knowledge Discovery from Data","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3649143","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

摘要

图自监督表征学习在从图中提取有意义的表征（尤其是在没有标记数据的情况下）方面获得了相当多的关注，并显示出显著的功效。该领域的两种代表性方法是图自动编码和图对比学习。不过，前一种方法主要关注全局结构，在重建过程中可能会忽略一些细粒度信息。后一种方法强调嵌入空间中相关视图的节点相似性，可能会忽略原始输入空间中固有的全局图信息。此外，在现实世界中，某些节点的原始特征不可用，处理这种不完整的图对这两类方法都提出了挑战。为了缓解这些限制，我们将屏蔽图自动编码和原型感知图对比学习整合到一个统一的模型中，以学习图中的节点表征。在我们的方法中，我们首先屏蔽部分节点特征，然后利用特定的解码策略重建屏蔽信息。这一过程有助于从全局或宏观层面恢复图，并能轻松处理不完整的图。此外，我们将掩蔽图和原始图视为一对对比视图，在局部或微观层面上强化了其相应节点表示之间的对齐性和统一性。最后，为了从中观层面捕捉聚类结构并学习更具区分性的表征，我们引入了原型感知聚类一致性损失，该损失与上述两个互补目标共同优化。在多个数据集上进行的广泛实验表明，与最先进的方法相比，所提出的方法在下游任务（尤其是图聚类）上取得了明显更好或更有竞争力的性能，展示了它在增强图表征学习方面的优越性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

ProtoMGAE: Prototype-aware Masked Graph Auto-Encoder for Graph Representation Learning

Graph self-supervised representation learning has gained considerable attention and demonstrated remarkable efficacy in extracting meaningful representations from graphs, particularly in the absence of labeled data. Two representative methods in this domain are graph auto-encoding and graph contrastive learning. However, the former methods primarily focus on global structures, potentially overlooking some fine-grained information during reconstruction. The latter methods emphasize node similarity across correlated views in the embedding space, potentially neglecting the inherent global graph information in the original input space. Moreover, handling incomplete graphs in real-world scenarios, where original features are unavailable for certain nodes, poses challenges for both types of methods. To alleviate these limitations, we integrate masked graph auto-encoding and prototype-aware graph contrastive learning into a unified model to learn node representations in graphs. In our method, we begin by masking a portion of node features and utilize a specific decoding strategy to reconstruct the masked information. This process facilitates the recovery of graphs from a global or macro level and enables handling incomplete graphs easily. Moreover, we treat the masked graph and the original one as a pair of contrasting views, enforcing the alignment and uniformity between their corresponding node representations at a local or micro level. Lastly, to capture cluster structures from a meso level and learn more discriminative representations, we introduce a prototype-aware clustering consistency loss that is jointly optimized with the above two complementary objectives. Extensive experiments conducted on several datasets demonstrate that the proposed method achieves significantly better or competitive performance on downstream tasks, especially for graph clustering, compared with the state-of-the-art methods, showcasing its superiority in enhancing graph representation learning.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Transactions on Knowledge Discovery from Data COMPUTER SCIENCE, INFORMATION SYSTEMS-COMPUTER SCIENCE, SOFTWARE ENGINEERING

CiteScore

6.70

自引率

5.60%

发文量

172

审稿时长

3 months

期刊介绍： TKDD welcomes papers on a full range of research in the knowledge discovery and analysis of diverse forms of data. Such subjects include, but are not limited to: scalable and effective algorithms for data mining and big data analysis, mining brain networks, mining data streams, mining multi-media data, mining high-dimensional data, mining text, Web, and semi-structured data, mining spatial and temporal data, data mining for community generation, social network analysis, and graph structured data, security and privacy issues in data mining, visual, interactive and online data mining, pre-processing and post-processing for data mining, robust and scalable statistical methods, data mining languages, foundations of data mining, KDD framework and process, and novel applications and infrastructures exploiting data mining technology including massively parallel processing and cloud computing platforms. TKDD encourages papers that explore the above subjects in the context of large distributed networks of computers, parallel or multiprocessing computers, or new data devices. TKDD also encourages papers that describe emerging data mining applications that cannot be satisfied by the current data mining technology.