Collective Entity Disambiguation Based on Hierarchical Semantic Similarity

IF 0.7 | Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING

International Journal of Data Warehousing and Mining. Pub Date: 2020-04-01. DOI: 10.4018/ijdwm.2020040101

Bingjing Jia, Hu Yang, Bin Wu, Ying Xing


Citations: 2

Abstract

Entity disambiguation involves mapping mentions in text to the corresponding entities in a given knowledge base. Most previous approaches relied on handcrafted features and failed to capture semantic information at multiple granularities. To disambiguate entities accurately, several kinds of information about both mentions and entities should be used. This article proposes a hierarchical semantic similarity model that finds important clues relating mentions to entities from multiple sources of information, such as the contexts of the mentions, entity descriptions, and entity categories. The model can effectively measure the semantic match between mentions and target entities. Global features, including prior popularity and global coherence, are also added to improve performance. To verify the effect of the hierarchical semantic similarity model combined with global features, named HSSMGF, experiments were carried out on five publicly available benchmark datasets. The results show that the proposed method is especially effective when documents contain more mentions.
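To illustrate how a local mention-entity similarity might be combined with prior popularity and global coherence, here is a minimal toy sketch. It is not the HSSMGF model: the hierarchical semantic similarity is replaced by plain cosine similarity over made-up two-dimensional vectors, coherence is approximated by Jaccard overlap of category sets, and the joint assignment is found by brute force. All names (`disambiguate`, `w_local`, `w_prior`, `w_coh`), the knowledge-base entries, and the numbers are hypothetical.

```python
from itertools import product
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard overlap between two category sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disambiguate(mentions, kb, w_local=1.0, w_prior=0.5, w_coh=1.0):
    """Pick one entity per mention by scoring every joint assignment.

    Score = local similarity + prior popularity (per mention)
            + pairwise category coherence (across mentions).
    Brute force is only feasible for a handful of mentions.
    """
    names = list(mentions)
    best, best_score = None, float("-inf")
    for assignment in product(*(mentions[m]["candidates"] for m in names)):
        score = 0.0
        for m, e in zip(names, assignment):
            score += w_local * cosine(mentions[m]["context_vec"], kb[e]["desc_vec"])
            score += w_prior * kb[e]["prior"]
        for i in range(len(assignment)):
            for j in range(i + 1, len(assignment)):
                score += w_coh * jaccard(
                    kb[assignment[i]]["categories"],
                    kb[assignment[j]]["categories"],
                )
        if score > best_score:
            best, best_score = dict(zip(names, assignment)), score
    return best


# Hypothetical knowledge base: description vector, prior popularity, categories.
kb = {
    "Jordan_(country)": {"desc_vec": [1.0, 0.0], "prior": 0.6,
                         "categories": {"country", "middle_east"}},
    "Michael_Jordan": {"desc_vec": [0.0, 1.0], "prior": 0.4,
                       "categories": {"basketball", "person"}},
    "Chicago_Bulls": {"desc_vec": [0.1, 0.9], "prior": 0.9,
                      "categories": {"basketball", "team"}},
    "Bull_(animal)": {"desc_vec": [0.5, 0.5], "prior": 0.1,
                      "categories": {"animal"}},
}

# Hypothetical mentions from one document, each with a context vector.
mentions = {
    "Jordan": {"context_vec": [0.5, 0.5],
               "candidates": ["Jordan_(country)", "Michael_Jordan"]},
    "Bulls": {"context_vec": [0.2, 0.8],
              "candidates": ["Chicago_Bulls", "Bull_(animal)"]},
}

result = disambiguate(mentions, kb)
print(result)
```

In this toy data the context of "Jordan" is equally similar to both candidates, so prior popularity alone would pick the country; the coherence term with "Chicago_Bulls" is what tips the joint decision toward the basketball player. Setting `w_coh=0.0` reproduces the purely local choice.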


Source journal

International Journal of Data Warehousing and Mining (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

CiteScore

2.40

Self-citation rate

0.00%

Articles published

Review time

>12 weeks

About the journal: The International Journal of Data Warehousing and Mining (IJDWM) disseminates the latest international research findings in the areas of data management and analysis. IJDWM provides a forum for state-of-the-art developments and research, as well as current innovative activities focusing on the integration between the fields of data warehousing and data mining. Emphasizing applicability to real-world problems, this journal meets the needs of both academic researchers and practicing IT professionals. The journal is devoted to the publication of high-quality papers on theoretical developments and practical applications in data warehousing and data mining. Original research papers, state-of-the-art reviews, and technical notes are invited for publication. The journal accepts submissions of any work relevant to data warehousing and data mining. Special attention will be given to papers focusing on mining of data from data warehouses; integration of databases, data warehousing, and data mining; and holistic approaches to mining and archiving