Understanding Negative Sampling in Knowledge Graph Embedding

International journal of artificial intelligence & applications. Pub Date: 2021-01-31. DOI: 10.5121/IJAIA.2021.12105

Jing Qian, Gangmin Li, Katie Atkinson, Yong Yue

Volume 12, Issue 1, pp. 71-81. https://doi.org/10.5121/IJAIA.2021.12105

Citations: 4

Abstract


Understanding Negative Sampling in Knowledge Graph Embedding

Knowledge graph embedding (KGE) projects the entities and relations of a knowledge graph (KG) into a low-dimensional vector space, and the field has made steady progress in recent years. Conventional KGE methods, especially translational distance-based models, are trained by discriminating positive samples from negative ones. Because most KGs store only positive samples for space efficiency, negative sampling plays a crucial role in encoding the triples of a KG. The quality of the generated negative samples directly affects how well the learnt knowledge representation performs in downstream tasks such as recommendation, link prediction and node classification. We group current negative sampling approaches in KGE into three categories: static distribution-based, dynamic distribution-based and custom cluster-based. Based on this categorization, we discuss the most prevalent existing approaches and their characteristics. We hope this review can offer some guidance for new ideas about negative sampling in KGE.
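To make the training setup in the abstract concrete, the following is a minimal sketch of one common static negative sampling scheme, uniform corruption of the head or tail entity, paired with a TransE-style margin loss. It is an illustration under stated assumptions, not the authors' implementation: the function names (`corrupt_uniform`, `transe_distance`, `margin_loss`), the toy entities and the hand-picked 2-dimensional embeddings are all hypothetical.

```python
import random

def corrupt_uniform(triple, entities, known, rng):
    """Build a negative by replacing the head or the tail with a uniformly drawn entity."""
    h, r, t = triple
    while True:
        e = rng.choice(entities)
        candidate = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        # Reject the original triple and any corruption that is itself a stored positive.
        if candidate not in known:
            return candidate

def transe_distance(emb_e, emb_r, triple):
    """L1 distance ||h + r - t||, the score used by translational models such as TransE."""
    h, r, t = triple
    return sum(abs(a + b - c) for a, b, c in zip(emb_e[h], emb_r[r], emb_e[t]))

def margin_loss(emb_e, emb_r, pos, neg, margin=1.0):
    """Hinge loss that pushes the positive at least `margin` closer than the negative."""
    gap = transe_distance(emb_e, emb_r, pos) - transe_distance(emb_e, emb_r, neg)
    return max(0.0, margin + gap)

# Toy KG (hypothetical): only positive triples are stored.
entities = ["paris", "france", "berlin", "germany"]
positives = {("paris", "capital_of", "france"), ("berlin", "capital_of", "germany")}

# Hand-picked embeddings (hypothetical) so the numbers are easy to follow.
emb_e = {"paris": [0.0, 0.0], "france": [1.0, 0.0],
         "berlin": [0.0, 1.0], "germany": [1.0, 1.0]}
emb_r = {"capital_of": [1.0, 0.0]}

pos = ("paris", "capital_of", "france")
neg = corrupt_uniform(pos, entities, positives, random.Random(0))
loss = margin_loss(emb_e, emb_r, pos, neg)
```

Uniform corruption is cheap but often yields negatives that are trivially easy to tell apart from positives, which is one motivation for the dynamic and cluster-based categories the review discusses.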
