C-KGE: Curriculum learning-based Knowledge Graph Embedding
Diange Zhou, Shengwen Li, Lijun Dong, Renyao Chen, Xiaoyue Peng, Hong Yao
Computer Speech and Language, published 2024-07-08. DOI: 10.1016/j.csl.2024.101689
https://www.sciencedirect.com/science/article/pii/S088523082400072X
Abstract
Knowledge graph embedding (KGE) aims to embed the entities and relations of knowledge graphs (KGs) into a continuous, low-dimensional vector space. It has been shown to be an effective tool for integrating knowledge graphs to improve various intelligent applications, such as question answering and information extraction. However, previous KGE models ignore the hidden natural order of knowledge learning when learning the embeddings of entities and relations, leaving room for improvement in their performance. Inspired by the easy-to-hard pattern of human knowledge learning, this paper proposes a Curriculum learning-based KGE (C-KGE) model, which learns the embeddings of entities and relations from “basic knowledge” to “domain knowledge”. Specifically, a seed set representing the basic knowledge and several knowledge subsets are identified from the KG. Then, entity overlap is employed to score the learning difficulty of each subset. Finally, C-KGE trains the entities and relations in each subset according to its learning difficulty score. C-KGE leverages the trained embeddings of the seed set as prior knowledge and learns the knowledge subsets iteratively to transfer knowledge between the seed set and the subsets, smoothing the learning of knowledge facts. Experimental results on real-world datasets demonstrate that the proposed model achieves improved embedding performance while reducing training time. Our code and data will be released later.
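The abstract outlines a three-step pipeline: identify a seed set and knowledge subsets, score each subset's difficulty by entity overlap, then train from easy to hard while reusing earlier embeddings as prior knowledge. The Python sketch below is a minimal, hypothetical rendering of such a schedule, not the authors' released implementation: the triple representation, the `entity_overlap_score` heuristic, and the `model.fit` training hook are all illustrative assumptions.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

def entities_of(triples: List[Triple]) -> Set[str]:
    """Collect the set of entities mentioned in a list of triples."""
    return {h for h, _, _ in triples} | {t for _, _, t in triples}

def entity_overlap_score(subset: List[Triple], known: Set[str]) -> float:
    """Difficulty proxy (assumed): the larger the fraction of a subset's
    entities already covered by previously learned knowledge, the easier
    the subset is taken to be."""
    ents = entities_of(subset)
    return len(ents & known) / max(len(ents), 1)

def curriculum_order(seed: List[Triple],
                     subsets: List[List[Triple]]) -> List[List[Triple]]:
    """Order subsets from easy to hard by entity overlap with everything
    learned so far, mimicking the basic-to-domain progression."""
    known = entities_of(seed)
    ordered, remaining = [], list(subsets)
    while remaining:
        # Greedily pick the subset sharing the most entities with known ones.
        remaining.sort(key=lambda s: entity_overlap_score(s, known),
                       reverse=True)
        nxt = remaining.pop(0)
        ordered.append(nxt)
        known |= entities_of(nxt)
    return ordered

def train_curriculum(model, seed: List[Triple],
                     subsets: List[List[Triple]]) -> None:
    """Train on the seed set first, then on subsets in curriculum order;
    `model.fit` is a hypothetical hook that continues training from the
    model's current embeddings."""
    model.fit(seed)
    for subset in curriculum_order(seed, subsets):
        model.fit(subset)
```

Under these assumptions, each stage begins from embeddings that already cover most of its entity vocabulary, which is the knowledge-transfer effect the abstract attributes to training from the seed set outward.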
Journal Introduction
Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language.
The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing have become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.