Knowledge-embedded graph representation learning for document-level relation extraction

Jinglin Liang, Yutao Qin, Shuangping Huang, Yunqing Hu, Xinwu Liu, Huiyuan Zhang, Tianshui Chen

Expert Systems with Applications, Volume 295, Article 128872. Published 2025-07-09. DOI: 10.1016/j.eswa.2025.128872
Abstract
Document-level relation extraction (DocRE) aims to identify the relations between entities in a document and is a fundamental task in natural language processing. DocRE comes with strong inherent prior knowledge; for example, individuals and organizations tend to exhibit a “membership” relation rather than a “separation” relation. Leveraging this knowledge can effectively confine the model’s prediction space and highlight potential relations between entities. However, existing DocRE works focus primarily on designing sophisticated models that implicitly encode document features, inadvertently neglecting this informative prior knowledge, which can lead to suboptimal performance. In this work, we assume that the prior knowledge can be effectively represented by statistical co-occurrence correlations between entity types and relations. Based on this premise, we propose Knowledge-Embedded Graph Representation Learning (KEGRL), a novel algorithm that enhances entity feature representations with this statistical prior knowledge. Specifically, we compute the statistical co-occurrence correlations between entity types and relations. These correlations are then encapsulated in weighted edge-oriented heterogeneous graphs whose nodes correspond to entities and relations. Every entity node is connected to all relation nodes, and the edge weights encode the statistical correlations between entities and relations. Over these graphs, entity features propagate under the guidance of the statistical correlations, injecting the statistical knowledge into the entity features and enhancing their distinctiveness. Extensive experiments on multiple baseline models and datasets consistently demonstrate that integrating KEGRL significantly improves the performance of DocRE models. The code is available at https://github.com/MrDreamQ/KEGRL.
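As a rough illustration of the idea described in the abstract (a minimal sketch, not the authors' actual implementation; see the linked repository for that), the Python snippet below shows one way to (1) estimate entity-type/relation co-occurrence statistics from training annotations and (2) use them as edge weights on a bipartite entity–relation graph to inject knowledge-aware messages into entity representations. All names and shapes here (co_occurrence_matrix, KnowledgeGraphLayer, etc.) are illustrative assumptions.

```python
# Illustrative sketch only: estimate type-relation co-occurrence statistics and
# propagate entity features over a weighted entity-relation bipartite graph.
# Names and shapes are assumptions, not the authors' implementation.
import torch
import torch.nn as nn


def co_occurrence_matrix(samples, num_types, num_relations):
    """Count how often each entity type co-occurs with each relation, then
    row-normalize the counts into correlation scores in [0, 1].

    samples: iterable of (head_type_id, tail_type_id, relation_id) triples
             taken from the training annotations.
    Returns a (num_types, num_relations) tensor whose entry (t, r) is the
    empirical probability that an entity of type t participates in relation r.
    """
    counts = torch.zeros(num_types, num_relations)
    for head_type, tail_type, relation in samples:
        counts[head_type, relation] += 1.0
        counts[tail_type, relation] += 1.0
    row_sums = counts.sum(dim=1, keepdim=True).clamp(min=1.0)
    return counts / row_sums


class KnowledgeGraphLayer(nn.Module):
    """One round of message passing on a bipartite entity-relation graph.

    Each entity node is connected to every relation node; the edge weight is the
    statistical correlation between the entity's type and that relation. Relation
    nodes carry learnable embeddings, and their messages are aggregated into the
    entity features, injecting the prior knowledge.
    """

    def __init__(self, hidden_dim, num_relations):
        super().__init__()
        self.relation_embed = nn.Embedding(num_relations, hidden_dim)
        self.message_proj = nn.Linear(hidden_dim, hidden_dim)
        self.update = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, entity_feats, entity_type_ids, type_relation_corr):
        # entity_feats: (num_entities, hidden_dim) contextual entity representations
        # entity_type_ids: (num_entities,) type id of each entity
        # type_relation_corr: (num_types, num_relations) co-occurrence correlations
        edge_weights = type_relation_corr[entity_type_ids]             # (E, R)
        relation_msgs = self.message_proj(self.relation_embed.weight)  # (R, H)
        knowledge = edge_weights @ relation_msgs                       # (E, H)
        fused = torch.cat([entity_feats, knowledge], dim=-1)
        return torch.relu(self.update(fused)) + entity_feats           # residual update


# Minimal usage example with toy sizes.
if __name__ == "__main__":
    corr = co_occurrence_matrix([(0, 1, 2), (0, 1, 2), (1, 0, 0)],
                                num_types=3, num_relations=4)
    layer = KnowledgeGraphLayer(hidden_dim=8, num_relations=4)
    feats = torch.randn(5, 8)              # 5 entities in a document
    types = torch.tensor([0, 1, 2, 0, 1])  # their entity-type ids
    enriched = layer(feats, types, corr)
    print(enriched.shape)                  # torch.Size([5, 8])
```

In this sketch the entity-relation edges are fully connected, as described in the abstract, and the co-occurrence scores simply reweight the relation-node messages; the enriched entity features would then feed a standard DocRE classification head.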
Journal Introduction
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.