On Interpretation of Network Embedding via Taxonomy Induction

Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. Pub Date: 2018-07-19. DOI: 10.1145/3219819.3220001

Ninghao Liu, Xiao Huang, Jundong Li, Xia Hu


Citations: 47

Abstract


On Interpretation of Network Embedding via Taxonomy Induction

Network embedding is increasingly used in network analytics applications to generate low-dimensional vector representations, so that off-the-shelf models can be applied to a wide variety of data mining tasks. However, as with many other machine learning methods, network embedding results remain hard for users to understand. Each dimension of the embedding space usually has no specific meaning, so it is difficult to comprehend how the embedded instances are distributed in the reconstructed space. In addition, heterogeneous content information may be incorporated into network embedding, which makes it challenging to specify which source of information is effective in generating the embedding results. In this paper, we investigate the interpretation of network embedding, aiming to understand how instances are distributed in the embedding space and to explore the factors that lead to the embedding results. We resort to a post-hoc interpretation scheme, so that our approach can be applied to different types of embedding methods. Specifically, the interpretation of network embedding is presented in the form of a taxonomy. Effective objectives and corresponding algorithms are developed for building the taxonomy. We also design several metrics to evaluate interpretation results. Experiments on real-world datasets from different domains demonstrate that, compared with state-of-the-art alternatives, our approach produces effective and meaningful interpretations of embedding results.
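To make the idea of a post-hoc taxonomy over an embedding concrete, the sketch below groups already-computed embedding vectors into a small tree. It is not the paper's algorithm: the abstract does not specify the objectives or algorithms, so a plain recursive bisecting 2-means is swapped in as a stand-in. The names `TaxonomyNode`, `build_taxonomy`, `min_size`, and `max_depth`, as well as the toy vectors, are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomyNode:
    """One cluster in the (hypothetical) taxonomy tree."""
    members: list                      # indices of embedded instances in this cluster
    children: list = field(default_factory=list)


def _sq_dist(u, v):
    # Squared Euclidean distance between two vectors.
    return sum((x - y) ** 2 for x, y in zip(u, v))


def _mean(vectors):
    # Component-wise mean of a non-empty list of vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def _two_means(embeddings, members, iters=20):
    """Deterministic 2-means split of `members` (stand-in, not the paper's objective)."""
    # Seed with the first member and the member farthest from it.
    a = embeddings[members[0]]
    b = max((embeddings[i] for i in members), key=lambda v: _sq_dist(a, v))
    centers = [a, b]
    groups = [list(members), []]
    for _ in range(iters):
        new_groups = [[], []]
        for i in members:
            d0 = _sq_dist(embeddings[i], centers[0])
            d1 = _sq_dist(embeddings[i], centers[1])
            new_groups[0 if d0 <= d1 else 1].append(i)
        if not new_groups[0] or not new_groups[1]:
            break                      # degenerate split: keep the previous grouping
        if new_groups == groups:
            break                      # assignments stopped changing
        groups = new_groups
        centers = [_mean([embeddings[i] for i in g]) for g in groups]
    return groups


def build_taxonomy(embeddings, members=None, min_size=2, max_depth=3):
    """Recursively bisect the embedded instances into a binary tree."""
    if members is None:
        members = list(range(len(embeddings)))
    node = TaxonomyNode(members=members)
    if len(members) <= min_size or max_depth == 0:
        return node
    left, right = _two_means(embeddings, members)
    if left and right:
        node.children = [
            build_taxonomy(embeddings, g, min_size, max_depth - 1)
            for g in (left, right)
        ]
    return node


# Toy 2-D "embeddings": two well-separated groups of three instances each.
embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.2),
]
root = build_taxonomy(embeddings, min_size=3)
print([child.members for child in root.children])
```

Because the tree is built only from the output vectors, the same sketch applies to any embedding method, which is the sense in which the scheme is post-hoc. Explaining each node in terms of network structure or content attributes, which the abstract names as a goal, is outside what this sketch does.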

