Improving the Efficiency of Link Prediction on Handling Incomplete Knowledge Graph Using Clustering

Fitri Susanti, N. Maulidevi, K. Surendro
Proceedings of the 2023 12th International Conference on Software and Computer Applications, 2023-02-23
DOI: 10.1145/3587828.3587830 (https://doi.org/10.1145/3587828.3587830)

Abstract

A knowledge graph (KG) stores knowledge as connected facts, each represented as a triple (subject, predicate, object) or, equivalently, (head, relation, tail). KGs are widely used in question answering, information retrieval, classification, recommender systems, and similar tasks. A common problem is that KGs are incomplete: a KG is called incomplete if a relationship between two entities is missing, and this can reduce the accuracy of any task that relies on it. One solution is link prediction, which aims to predict the missing relationship between two entities in a KG. A second problem is size: a KG can consist of hundreds to millions of entities and relationships, so link prediction on a large KG must also be efficient. This paper discusses embedding-based link prediction as a way to overcome the incomplete-KG problem and proposes clustering to make the link prediction process more efficient. Clustering groups the embedding results; the scoring and loss-function calculations used to predict missing links are then carried out only within the groups considered appropriate. This grouping is expected to reduce the time needed for link prediction, because not every vector in the embedding space has to be checked.
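The idea of scoring only inside an appropriate group can be illustrated with a minimal sketch. The abstract does not name an embedding model or a clustering algorithm, so everything here is an assumption made for illustration: a TransE-style score (the tail should lie near head + relation), a plain k-means with farthest-first initialisation, random toy vectors in place of trained embeddings, and the hypothetical helper names `assign`, `kmeans`, and `predict_tail`. Only the scoring step is shown; the loss function used during training is omitted.

```python
import numpy as np


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k, iters=10):
    """Plain Lloyd's k-means (assumed here; the paper may use another method)."""
    # Farthest-first initialisation: start from point 0, then repeatedly
    # add the point that is farthest from the centroids chosen so far.
    chosen = [0]
    for _ in range(1, k):
        d = np.linalg.norm(points[:, None, :] - points[chosen][None, :, :], axis=2)
        chosen.append(int(d.min(axis=1).argmax()))
    centroids = points[chosen].astype(float)
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = points[labels == c]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[c] = members.mean(axis=0)
    # Final assignment so that labels agree with the returned centroids.
    return centroids, assign(points, centroids)


def predict_tail(head_vec, rel_vec, entity_vecs, centroids, labels):
    """Rank candidate tails only inside the cluster nearest to head + relation."""
    target = head_vec + rel_vec  # TransE-style assumption: h + r is close to t
    cluster = np.linalg.norm(centroids - target, axis=1).argmin()
    candidates = np.flatnonzero(labels == cluster)
    scores = np.linalg.norm(entity_vecs[candidates] - target, axis=1)
    order = scores.argsort()  # smaller distance = more plausible tail
    return candidates[order], scores[order]


# Toy data standing in for trained embeddings: two well-separated groups.
rng = np.random.default_rng(0)
entity_vecs = np.vstack([
    rng.normal(0.0, 0.1, (50, 8)),   # entities 0..49
    rng.normal(10.0, 0.1, (50, 8)),  # entities 50..99
])
centroids, labels = kmeans(entity_vecs, k=2)

head = entity_vecs[3]
relation = entity_vecs[70] - head  # toy relation chosen so h + r lands on entity 70
ranked, scores = predict_tail(head, relation, entity_vecs, centroids, labels)
print("best tail:", ranked[0], "| candidates scored:", len(ranked), "of", len(entity_vecs))
```

In this sketch the query is compared against the cluster centroids first and then only against the members of one cluster, rather than against every entity vector. The trade-off is that a correct tail lying in a different cluster would be missed, so the choice of the number of clusters matters.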