AdaE: Knowledge Graph Embedding With Adaptive Embedding Sizes

IF 10.4 | Zone 2, Computer Science | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-03-01 DOI:10.1109/TKDE.2025.3566270

Zhanpeng Guan;Fuwei Zhang;Zhao Zhang;Fuzhen Zhuang;Fei Wang;Zhulin An;Yongjun Xu

Volume 37, Issue 8, pp. 4432-4445. https://ieeexplore.ieee.org/document/10981648/

Citations: 0

Abstract

Knowledge Graph Embedding (KGE) aims to learn dense embeddings that represent the entities and relations in knowledge graphs (KGs). Entities in existing KGs suffer from data imbalance: their occurrence frequencies differ substantially. Existing KGE models predefine a single, fixed dimension size for all entity embeddings. However, the appropriate embedding size for an entity depends on its frequency, and a uniform size may represent entities inadequately, leading to overfitting for low-frequency entities and underfitting for high-frequency ones. A straightforward idea is to set the embedding size of each entity before KGE training. However, manually selecting different embedding sizes is labor-intensive and time-consuming, which poses challenges in real-world applications. To tackle this problem, we propose AdaE, which adaptively learns KG embeddings with different embedding sizes during training. In particular, AdaE can select an appropriate dimension size for each entity from a continuous integer space. To this end, we tailor bilevel optimization to the KGE task, alternately learning the representations and the embedding sizes of entities. Our framework is general and flexible and fits various existing KGE models. Extensive experiments demonstrate the effectiveness and compatibility of AdaE.
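
The manual baseline mentioned in the abstract, fixing an embedding size for each entity before training, can be sketched as follows. This is a minimal illustration and not AdaE itself, which learns the sizes during training through bilevel optimization. The log-frequency rule, the function names, the toy triples, and the 2 to 8 dimension range are all assumptions made for the example and do not come from the paper.

```python
from collections import Counter
import math


def entity_frequencies(triples):
    """Count how often each entity appears as a head or a tail."""
    counts = Counter()
    for head, _relation, tail in triples:
        counts[head] += 1
        counts[tail] += 1
    return counts


def heuristic_size(freq, max_freq, min_dim=2, max_dim=8):
    """Hypothetical rule: grow the size with log-frequency (not AdaE's learned rule)."""
    if max_freq <= 1:
        return max_dim
    ratio = math.log(freq) / math.log(max_freq)
    return min_dim + round(ratio * (max_dim - min_dim))


def dimension_mask(size, max_dim=8):
    """Binary mask that keeps the first `size` dimensions and zeroes the rest."""
    return [1.0 if i < size else 0.0 for i in range(max_dim)]


def apply_mask(embedding, mask):
    """Element-wise product of a full-width embedding and its mask."""
    return [value * keep for value, keep in zip(embedding, mask)]


# Toy, imbalanced knowledge graph: "france" occurs far more often than the others.
triples = [
    ("paris", "capital_of", "france"),
    ("lyon", "city_in", "france"),
    ("nice", "city_in", "france"),
    ("france", "member_of", "eu"),
]

freqs = entity_frequencies(triples)
max_freq = max(freqs.values())
sizes = {entity: heuristic_size(count, max_freq) for entity, count in freqs.items()}

# A rare entity keeps only its first few dimensions of a shared full-width vector.
paris_embedding = apply_mask([0.5] * 8, dimension_mask(sizes["paris"]))
```

Storing every entity at the maximum width and masking the unused tail keeps the scoring function of an existing KGE model unchanged, which is one simple way to mix sizes. Hand-tuning such a rule for every dataset is the labor-intensive step the abstract says AdaE is meant to avoid.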

Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore

11.70

Self-citation rate

3.40%

Articles published

515

Review time

6 months

About the journal: The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.