An Embedding Model for Knowledge Graph Completion Based on Graph Sub-Hop Convolutional Network
Haitao He, Haoran Niu, Jianzhou Feng, Junlan Nie, Yangsen Zhang, Jiadong Ren
Big Data Research, Volume 30, Article 100351. Published 2022-11-28. Journal Article.
DOI: 10.1016/j.bdr.2022.100351
URL: https://www.sciencedirect.com/science/article/pii/S2214579622000454
Impact factor: 3.5. JCR: Q2 (Computer Science, Artificial Intelligence).
Citations: 0
Abstract
Research on knowledge graph completion based on representation learning increasingly depends on the structural features of nodes in the graph. However, many nodes have few immediate neighbors, so their features cannot be fully expressed. Multi-hop structural features are therefore crucial to node representation learning. The Graph Convolutional Network (GCN) is a graph embedding model that can incorporate multi-hop structure, but the multi-hop information passed between GCN layers suffers substantial loss. This leads to insufficient mining of node structural features and of the semantic feature associations among entities, which further reduces the efficiency of knowledge graph completion. To fill these gaps, a gate-controlled graph sub-hop convolutional network model for knowledge graph completion is proposed. First, a graph sub-hop convolutional network based on matrix representation is designed; it transmits multi-hop neighbor features directly to the encoded node vector, avoiding a large loss of features during multi-hop transmission. On this basis, implicit multi-hop relations are explicitly embedded into the model using TransE. Because noise and redundancy accumulate as the receptive field grows with each hop convolution, a sub-hop gate mechanism is proposed to filter information. Finally, a linear model decodes the encoded nodes to complete the knowledge graph. Experimental comparison and analysis were carried out on the WN18RR, FB15k-237, UMLS, and KINSHIP datasets. The results show that the embedding method based on sub-hop structural information fusion can greatly improve link prediction results.
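The abstract gives no formulas, so the following is only an illustrative sketch of the ideas it names, not the authors' model. It assumes that each hop k has its own row-normalised k-step reachability matrix, that k-hop features are sent to the node in a single step rather than through stacked layers, and that a sigmoid gate per hop filters the message. The names `hop_matrices`, `sub_hop_encode`, `compose_path`, and all weight matrices are hypothetical. The TransE part uses the standard TransE assumption (head + relation ≈ tail), under which a multi-hop path can be represented by the sum of its relation vectors. The linear decoder is omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def hop_matrices(adj, num_hops):
    """Row-normalised k-step reachability matrices for k = 1..num_hops (assumed form)."""
    mats = []
    power = np.eye(adj.shape[0])
    for _ in range(num_hops):
        power = power @ adj                        # entry (i, j) counts walks of length k
        reach = (power > 0).astype(float)          # 1 where some k-step walk exists
        deg = reach.sum(axis=1, keepdims=True)
        mats.append(reach / np.maximum(deg, 1.0))  # average over k-hop neighbours
    return mats


def sub_hop_encode(adj, feats, hop_weights, gate_weights):
    """Aggregate each hop directly into the node vector, filtered by a per-hop gate."""
    hidden = np.zeros((feats.shape[0], hop_weights[0].shape[1]))
    gates = []
    mats = hop_matrices(adj, len(hop_weights))
    for a_k, w_k, u_k in zip(mats, hop_weights, gate_weights):
        msg = a_k @ feats @ w_k      # k-hop features, not relayed through other layers
        gate = sigmoid(msg @ u_k)    # hypothetical gate, values between 0 and 1
        hidden += gate * msg         # gated message added to the encoded node vector
        gates.append(gate)
    return np.tanh(hidden), gates


def transe_score(head, relation, tail):
    """TransE plausibility: 0 is a perfect fit, more negative is worse."""
    return -np.linalg.norm(head + relation - tail)


def compose_path(relations):
    """Under TransE, a relation path is approximated by the sum of its relation vectors."""
    return np.sum(relations, axis=0)


# Toy path graph 0 - 1 - 2 - 3 with random features and weights (illustrative only).
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = rng.normal(size=(4, 5))
hop_weights = [rng.normal(size=(5, 3)) for _ in range(2)]
gate_weights = [rng.normal(size=(3, 3)) for _ in range(2)]
encoded, gates = sub_hop_encode(adj, feats, hop_weights, gate_weights)
print(encoded.shape)  # (4, 3)

# A two-hop path h -r1-> m -r2-> t scored as a single composed relation.
h, r1, r2 = np.zeros(2), np.array([1.0, 0.0]), np.array([0.0, 1.0])
t = h + r1 + r2
path_score = transe_score(h, compose_path([r1, r2]), t)
```

In this sketch the two-hop message reaches the node through one matrix product, which is one plausible reading of "transmit multi-hop neighbor features directly to the encoded node vector"; a stacked GCN would instead pass it through an intermediate layer.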
About the journal:
The journal aims to promote and communicate advances in big data research by providing a fast and high quality forum for researchers, practitioners and policy makers from the very many different communities working on, and with, this topic.
The journal will accept papers on foundational aspects in dealing with big data, as well as papers on specific Platforms and Technologies used to deal with big data. To promote Data Science and interdisciplinary collaboration between fields, and to showcase the benefits of data driven research, papers demonstrating applications of big data in domains as diverse as Geoscience, Social Web, Finance, e-Commerce, Health Care, Environment and Climate, Physics and Astronomy, Chemistry, life sciences and drug discovery, digital libraries and scientific publications, security and government will also be considered. Occasionally the journal may publish whitepapers on policies, standards and best practices.