Relational Clustering-Based Parallel Spaces Construction and Embedding for Dynamic Knowledge Graph

IF 5.7 · Zone 3, Computer Science · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2308-2320 · Pub Date: 2025-01-08 · DOI: 10.1109/TBDATA.2025.3527238

Yao Liu; Yongfei Zhang


Citations: 0


Relational Clustering-Based Parallel Spaces Construction and Embedding for Dynamic Knowledge Graph

With the increasing amount of data in various domains, knowledge graphs (KGs) have become powerful tools for representing complex and heterogeneous information in a structured way, and for extracting valuable information from knowledge graphs through embedding techniques to support downstream tasks such as recommendation and Q&A systems. Knowledge graphs consist of triples that are continuously added as knowledge is updated. However, most existing embedding models are designed for static graphs, requiring the entire model to be retrained for each update, which is time-consuming. Existing global dynamic embedding models focus on exploiting the structural and relational information of the whole graph to achieve embedding quality, resulting in reduced dynamic efficiency. To address this problem, we propose a relational clustering-based parallel space model in which knowledge from different domains is embedded in different subspaces, allowing each subspace to focus on the data characteristics of a specific domain, thereby improving the quality of knowledge. Second, the new data only affects some subspaces but not the performance of other spaces, improving the model's adaptability to dynamics. Furthermore, we employ two incremental approaches based on the type of added data to improve the efficiency of dynamic embedding while ensuring that the added data preserves the characteristics of the parallel space. The experimental results show that the dynamic embedding efficiency of our model is improved by an average of 50.3% compared to the SOTA dynamic embedding model for the link prediction task. Particularly on FB15K, our model not only improves the efficiency by 41% but also increases the accuracy by 7.5%, demonstrating the accuracy and efficiency of our model.
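The abstract's central idea, that triples are grouped into parallel subspaces by relation cluster and that new data touches only some of those subspaces, can be illustrated with a minimal sketch. This is not the authors' implementation: the relation-to-cluster mapping `RELATION_CLUSTER`, the cluster names, the example triples, and the helper functions `partition` and `affected_spaces` are all hypothetical, and the abstract does not say how the clustering or the two incremental approaches actually work.

```python
from collections import defaultdict

# Hypothetical relation -> cluster assignment. In a real system this would be
# produced by some clustering of relations; the abstract does not specify how.
RELATION_CLUSTER = {
    "born_in": "people",
    "works_for": "people",
    "capital_of": "geo",
    "located_in": "geo",
}


def partition(triples, relation_cluster):
    """Group (head, relation, tail) triples by the cluster of their relation."""
    spaces = defaultdict(list)
    for head, relation, tail in triples:
        spaces[relation_cluster[relation]].append((head, relation, tail))
    return dict(spaces)


def affected_spaces(new_triples, relation_cluster):
    """Return the set of subspaces that a batch of new triples touches."""
    return {relation_cluster[relation] for _, relation, _ in new_triples}


# Illustrative data only.
base = [
    ("Ada", "born_in", "London"),
    ("Ada", "works_for", "Analytical Engines Ltd"),
    ("London", "capital_of", "UK"),
]
update = [("Paris", "capital_of", "France")]

spaces = partition(base, RELATION_CLUSTER)
to_update = affected_spaces(update, RELATION_CLUSTER)

print(sorted(spaces))  # subspace names built from the base graph
print(to_update)       # only these subspaces would need re-embedding
```

In this toy setup the update touches only the `geo` subspace, so an embedding for the `people` subspace could be left as it is. Triples whose relation is not yet in the mapping are outside the scope of the sketch.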


Source journal

IEEE Transactions on Big Data Multiple-

CiteScore: 11.80
Self-citation rate: 2.80%
Articles published: 114

About the journal: The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.