C-KGE: Curriculum learning-based Knowledge Graph Embedding
Diange Zhou, Shengwen Li, Lijun Dong, Renyao Chen, Xiaoyue Peng, Hong Yao
Computer Speech and Language, published 2024-07-08. DOI: 10.1016/j.csl.2024.101689
https://www.sciencedirect.com/science/article/pii/S088523082400072X
Abstract
Knowledge graph embedding (KGE) aims to embed the entities and relations of knowledge graphs (KGs) into a continuous, low-dimensional vector space. It has been shown to be an effective tool for integrating knowledge graphs to improve various intelligent applications, such as question answering and information extraction. However, previous KGE models ignore the hidden natural order of knowledge learning when learning the embeddings of entities and relations, leaving room for improvement in their performance. Inspired by the easy-to-hard pattern of human knowledge learning, this paper proposes a Curriculum learning-based KGE (C-KGE) model, which learns the embeddings of entities and relations from “basic knowledge” to “domain knowledge”. Specifically, a seed set representing the basic knowledge and several knowledge subsets are identified from the KG. Then, entity overlap is employed to score the learning difficulty of each subset. Finally, C-KGE trains the entities and relations in each subset according to its learning difficulty score. C-KGE leverages the trained embeddings of the seed set as prior knowledge and learns the knowledge subsets iteratively to transfer knowledge between the seed set and the subsets, smoothing the learning of knowledge facts. Experimental results on real-world datasets demonstrate that the proposed model achieves improved embedding performance while reducing training time. Our code and data will be released later.
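The abstract outlines a three-step pipeline: identify a seed set and knowledge subsets, score each subset's difficulty by entity overlap, then train from easy to hard while reusing earlier embeddings as prior knowledge. The Python sketch below is a minimal, hypothetical rendering of such a schedule, not the authors' released implementation: the triple representation, the `entity_overlap_score` heuristic, and the `model.fit` training hook are all illustrative assumptions.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

def entities_of(triples: List[Triple]) -> Set[str]:
    """Collect the set of entities mentioned in a list of triples."""
    return {h for h, _, _ in triples} | {t for _, _, t in triples}

def entity_overlap_score(subset: List[Triple], known: Set[str]) -> float:
    """Difficulty proxy (assumed): the larger the fraction of a subset's
    entities already covered by previously learned knowledge, the easier
    the subset is taken to be."""
    ents = entities_of(subset)
    return len(ents & known) / max(len(ents), 1)

def curriculum_order(seed: List[Triple],
                     subsets: List[List[Triple]]) -> List[List[Triple]]:
    """Order subsets from easy to hard by entity overlap with everything
    learned so far, mimicking the basic-to-domain progression."""
    known = entities_of(seed)
    ordered, remaining = [], list(subsets)
    while remaining:
        # Greedily pick the subset sharing the most entities with known ones.
        remaining.sort(key=lambda s: entity_overlap_score(s, known),
                       reverse=True)
        nxt = remaining.pop(0)
        ordered.append(nxt)
        known |= entities_of(nxt)
    return ordered

def train_curriculum(model, seed: List[Triple],
                     subsets: List[List[Triple]]) -> None:
    """Train on the seed set first, then on subsets in curriculum order;
    `model.fit` is a hypothetical hook that continues training from the
    model's current embeddings."""
    model.fit(seed)
    for subset in curriculum_order(seed, subsets):
        model.fit(subset)
```

Under these assumptions, each stage begins from embeddings that already cover most of its entity vocabulary, which is the knowledge-transfer effect the abstract attributes to training from the seed set outward.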
Journal Introduction
Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language.
The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing have become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.