Semantic Annotation Approach for Information Search

Res. Comput. Sci. Pub Date: 2019-12-31 DOI: 10.13053/rcs-148-11-5

Fernando Pech-May, Alicia Martínez Rebollar, Jorge Magaña-Govea, Luis Antonio López Gómez, Edna M. Mil-Chontal


Citations: 1

Abstract

The need to improve information search has led to new strategies for enhancing searches. Semantic search matches on meaning rather than on literal strings. Semantic search over unstructured documents requires knowledge to be formalized through a semantic annotation process. Some annotation proposals use natural language processing tools and ontologies to link document terms; others compute the similarity of entities from the weight of the edges, the association between pairs of concepts, or the ontology structure. In this paper we present an alternative for semantic annotation of unstructured documents based on extracting the semantic context of entities. In this approach, we detect named entities using a data dictionary created from Wikipedia and link them to instances in the ontology. The context extraction strategy is based on concept similarity: each term is associated with an instance of the ontology, and the similarity between explicit relationships is measured by combining two measures, the association between each pair of concepts and the weight of the relationships. The approach was tested with two ontologies and two datasets, in the news and business domains, respectively.
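The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the dictionary entries, the ontology edges and their weights, the Jaccard-style association measure, and the linear mixing parameter `alpha` are all hypothetical stand-ins chosen only to show how a dictionary lookup, an association score, and a relationship weight might be combined.

```python
from itertools import combinations

# Hypothetical surface-form dictionary (the paper builds one from Wikipedia).
DICTIONARY = {
    "apple": "onto:Apple_Inc",
    "tim cook": "onto:Tim_Cook",
    "cupertino": "onto:Cupertino",
}

# Hypothetical ontology: weighted, undirected relationships between instances.
EDGES = {
    frozenset({"onto:Apple_Inc", "onto:Tim_Cook"}): 0.9,
    frozenset({"onto:Apple_Inc", "onto:Cupertino"}): 0.6,
}


def detect_entities(text, dictionary=DICTIONARY, max_len=3):
    """Greedy longest-match lookup of token n-grams in the dictionary."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    found, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in dictionary:
                found.append(dictionary[phrase])
                i += n
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return found


def neighbours(instance, edges=EDGES):
    """Instances directly related to `instance` in the ontology."""
    result = set()
    for edge in edges:
        if instance in edge:
            result |= edge - {instance}
    return result


def association(a, b, edges=EDGES):
    """Jaccard overlap of closed neighbourhoods (an assumed association measure)."""
    na = neighbours(a, edges) | {a}
    nb = neighbours(b, edges) | {b}
    return len(na & nb) / len(na | nb)


def similarity(a, b, alpha=0.5, edges=EDGES):
    """Assumed linear mix of association and explicit relationship weight."""
    weight = edges.get(frozenset({a, b}), 0.0)
    return alpha * association(a, b, edges) + (1 - alpha) * weight


def annotate(text, alpha=0.5):
    """Score every pair of distinct entities detected in the text."""
    entities = list(dict.fromkeys(detect_entities(text)))  # dedupe, keep order
    return {(a, b): similarity(a, b, alpha) for a, b in combinations(entities, 2)}


if __name__ == "__main__":
    for pair, score in annotate("Tim Cook leads Apple in Cupertino.").items():
        print(pair, round(score, 3))
```

In this sketch, pairs joined by an explicit ontology relationship score higher than pairs that only share a neighbour, which is the kind of signal a context extraction step could use to decide which entities belong to the same semantic context.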


Source journal

Res. Comput. Sci.

Self-citation rate

0.00%

Publication volume