Knowledge-embedded graph representation learning for document-level relation extraction

Jinglin Liang, Yutao Qin, Shuangping Huang, Yunqing Hu, Xinwu Liu, Huiyuan Zhang, Tianshui Chen

Expert Systems with Applications, Volume 295, Article 128872. Published 2025-07-09. DOI: 10.1016/j.eswa.2025.128872
Abstract
Document-level relation extraction (DocRE) aims to identify the relations between entities in a document and is a fundamental task in natural language processing. DocRE comes with strong inherent prior knowledge; for example, individuals and organizations tend to exhibit a “membership” relation rather than a “separation” relation. Leveraging this knowledge can effectively confine the model’s prediction space and highlight potential relations between entities. However, existing DocRE works focus primarily on designing sophisticated models that implicitly encode document features, inadvertently neglecting this informative prior knowledge, which can lead to suboptimal performance. In this work, we assume that the prior knowledge can be effectively represented by statistical co-occurrence correlations between entity types and relations. Based on this premise, we propose Knowledge-Embedded Graph Representation Learning (KEGRL), a novel algorithm that enhances entity feature representations with this statistical prior knowledge. Specifically, we compute the statistical co-occurrence correlations between entity types and relations. These correlations are then encapsulated in weighted edge-oriented heterogeneous graphs whose nodes correspond to entities and relations. Every entity node is connected to all relation nodes, and the edge weights encode the statistical correlations between entities and relations. Over these graphs, entity features propagate under the guidance of the statistical correlations, injecting the statistical knowledge into the entity features and enhancing their distinctiveness. Extensive experiments on multiple baseline models and datasets consistently demonstrate that integrating KEGRL significantly improves the performance of DocRE models. The code is available at https://github.com/MrDreamQ/KEGRL.
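As a rough illustration of the idea described in the abstract (a minimal sketch, not the authors' actual implementation; see the linked repository for that), the Python snippet below shows one way to (1) estimate entity-type/relation co-occurrence statistics from training annotations and (2) use them as edge weights on a bipartite entity–relation graph to inject knowledge-aware messages into entity representations. All names and shapes here (co_occurrence_matrix, KnowledgeGraphLayer, etc.) are illustrative assumptions.

```python
# Illustrative sketch only: estimate type-relation co-occurrence statistics and
# propagate entity features over a weighted entity-relation bipartite graph.
# Names and shapes are assumptions, not the authors' implementation.
import torch
import torch.nn as nn


def co_occurrence_matrix(samples, num_types, num_relations):
    """Count how often each entity type co-occurs with each relation, then
    row-normalize the counts into correlation scores in [0, 1].

    samples: iterable of (head_type_id, tail_type_id, relation_id) triples
             taken from the training annotations.
    Returns a (num_types, num_relations) tensor whose entry (t, r) is the
    empirical probability that an entity of type t participates in relation r.
    """
    counts = torch.zeros(num_types, num_relations)
    for head_type, tail_type, relation in samples:
        counts[head_type, relation] += 1.0
        counts[tail_type, relation] += 1.0
    row_sums = counts.sum(dim=1, keepdim=True).clamp(min=1.0)
    return counts / row_sums


class KnowledgeGraphLayer(nn.Module):
    """One round of message passing on a bipartite entity-relation graph.

    Each entity node is connected to every relation node; the edge weight is the
    statistical correlation between the entity's type and that relation. Relation
    nodes carry learnable embeddings, and their messages are aggregated into the
    entity features, injecting the prior knowledge.
    """

    def __init__(self, hidden_dim, num_relations):
        super().__init__()
        self.relation_embed = nn.Embedding(num_relations, hidden_dim)
        self.message_proj = nn.Linear(hidden_dim, hidden_dim)
        self.update = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, entity_feats, entity_type_ids, type_relation_corr):
        # entity_feats: (num_entities, hidden_dim) contextual entity representations
        # entity_type_ids: (num_entities,) type id of each entity
        # type_relation_corr: (num_types, num_relations) co-occurrence correlations
        edge_weights = type_relation_corr[entity_type_ids]             # (E, R)
        relation_msgs = self.message_proj(self.relation_embed.weight)  # (R, H)
        knowledge = edge_weights @ relation_msgs                       # (E, H)
        fused = torch.cat([entity_feats, knowledge], dim=-1)
        return torch.relu(self.update(fused)) + entity_feats           # residual update


# Minimal usage example with toy sizes.
if __name__ == "__main__":
    corr = co_occurrence_matrix([(0, 1, 2), (0, 1, 2), (1, 0, 0)],
                                num_types=3, num_relations=4)
    layer = KnowledgeGraphLayer(hidden_dim=8, num_relations=4)
    feats = torch.randn(5, 8)              # 5 entities in a document
    types = torch.tensor([0, 1, 2, 0, 1])  # their entity-type ids
    enriched = layer(feats, types, corr)
    print(enriched.shape)                  # torch.Size([5, 8])
```

In this sketch the entity-relation edges are fully connected, as described in the abstract, and the co-occurrence scores simply reweight the relation-node messages; the enriched entity features would then feed a standard DocRE classification head.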
Journal Introduction
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.