Finding additional semantic entity information for search engines
Jun Hou, R. Nayak, Jinglan Zhang
Australasian Document Computing Symposium, 2012-12-05
DOI: 10.1145/2407085.2407101 (https://doi.org/10.1145/2407085.2407101)
Citations: 1
Abstract
Entity-oriented search has become an essential component of modern search engines. It focuses on retrieving a list of entities, or information about specific entities, instead of documents. In this paper, we study the problem of finding entity-related information, referred to as attribute-value pairs, which plays a significant role in searching for target entities. We propose a novel decomposition framework that combines reduced relations with a discriminative model, the Conditional Random Field (CRF), to automatically find entity-related attribute-value pairs in free-text documents. This decomposition framework allows us to locate potential text fragments and identify their hidden semantics, in the form of attribute-value pairs, for user queries. Empirical analysis shows that the decomposition framework outperforms pattern-based approaches because it effectively integrates syntactic and semantic features.