Improving the Efficiency of Link Prediction on Handling Incomplete Knowledge Graph Using Clustering

Proceedings of the 2023 12th International Conference on Software and Computer Applications. Pub Date: 2023-02-23. DOI: 10.1145/3587828.3587830

Fitri Susanti, N. Maulidevi, K. Surendro


Citations: 0

Abstract


Improving the Efficiency of Link Prediction on Handling Incomplete Knowledge Graph Using Clustering

A knowledge graph (KG) stores knowledge as connected facts. Each fact is represented as a triple, written (subject, predicate, object) or (head, relation, tail). KGs are widely used in question answering, information retrieval, classification, recommender systems, and similar tasks. A common problem, however, is incompleteness: a KG is incomplete if a relationship between two entities is missing, and an incomplete KG can reduce the accuracy of any task that depends on it. One remedy is link prediction, which aims to predict the missing relationship between two entities in a KG. A second problem is scale: a KG can contain hundreds or even millions of entities and relationships, so link prediction on a large KG also needs to be efficient. This paper discusses link prediction using embedding to address the incomplete-KG problem and proposes clustering to make the link prediction process more efficient. Clustering groups the embedding results; the scoring and loss function calculations used to predict missing links are then carried out only within the groups considered appropriate. With this grouping, link prediction is expected to take less time because there is no need to check every vector in the embedding space.
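The idea of scoring candidates only inside an appropriate group can be illustrated with a minimal sketch. The abstract does not name an embedding model, a clustering algorithm, or a rule for picking the group, so everything specific here is an assumption made for illustration: a TransE-style distance score (head + relation should land near tail), plain k-means over the entity vectors, and "appropriate group" taken to mean the cluster whose centroid is closest to head + relation. The function names `kmeans` and `predict_tail` and the random toy embeddings are hypothetical, not taken from the paper.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    return centroids, labels

def predict_tail(head_vec, rel_vec, entity_emb, centroids, labels, top_n=3):
    """Rank candidate tails using only the cluster closest to head + relation."""
    target = head_vec + rel_vec
    # Assumed selection rule: pick the cluster whose centroid is nearest the target.
    nearest = np.linalg.norm(centroids - target, axis=1).argmin()
    candidates = np.flatnonzero(labels == nearest)
    # TransE-style score (an assumption): smaller distance = more plausible triple.
    scores = np.linalg.norm(entity_emb[candidates] - target, axis=1)
    order = scores.argsort()[:top_n]
    return candidates[order], scores[order]

# Toy data: random vectors stand in for trained embeddings (illustrative only).
rng = np.random.default_rng(42)
entity_emb = rng.normal(size=(1000, 16))
rel_vec = rng.normal(size=16)

centroids, labels = kmeans(entity_emb, k=10)
tails, scores = predict_tail(entity_emb[0], rel_vec, entity_emb, centroids, labels)
print("candidate tail ids:", tails)
```

In this sketch a query compares the target against k centroids and then against the members of one cluster, instead of against all N entity vectors. The trade-off is that a correct tail sitting in a different cluster is never scored, so any speed gain has to be weighed against possible loss in prediction quality.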
