基于语义层的自然语言查询认知规范化

ACM Trans. Speech Lang. Process. Pub Date : 2013-12-01 DOI:10.1145/2539053

S. Roy, W. Zeng

{"title":"基于语义层的自然语言查询认知规范化","authors":"S. Roy, W. Zeng","doi":"10.1145/2539053","DOIUrl":null,"url":null,"abstract":"Natural language search relies strongly on perceiving semantics in a query sentence. Semantics is captured by the relationship among the query words, represented as a network (graph). Such a network of words can be fed into larger ontologies, like DBpedia or Google Knowledge Graph, where they appear as subgraphs— fashioning the name subnetworks (subnets). Thus, subnet is a canonical form for interfacing a natural language query to a graph database and is an integral step for graph-based searching. In this article, we present a novel standalone NLP technique that leverages the cognitive psychology notion of semantic strata for semantic subnetwork extraction from natural language queries. The cognitive model describes some of the fundamental structures employed by the human cognition to construct semantic information in the brain, called semantic strata. We propose a computational model based on conditional random fields to capture the cognitive abstraction provided by semantic strata, facilitating cognitive canonicalization of the query. Our results, conducted on approximately 5000 queries, suggest that the cognitive canonicals based on semantic strata are capable of significantly improving parsing and role labeling performance beyond pure lexical approaches, such as parts-of-speech based techniques. We also find that cognitive canonicalized subnets are more semantically coherent compared to syntax trees when explored in graph ontologies like DBpedia and improve ranking of retrieved documents.","PeriodicalId":412532,"journal":{"name":"ACM Trans. Speech Lang. Process.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Cognitive canonicalization of natural language queries using semantic strata\",\"authors\":\"S. Roy, W. Zeng\",\"doi\":\"10.1145/2539053\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Natural language search relies strongly on perceiving semantics in a query sentence. Semantics is captured by the relationship among the query words, represented as a network (graph). Such a network of words can be fed into larger ontologies, like DBpedia or Google Knowledge Graph, where they appear as subgraphs— fashioning the name subnetworks (subnets). Thus, subnet is a canonical form for interfacing a natural language query to a graph database and is an integral step for graph-based searching. In this article, we present a novel standalone NLP technique that leverages the cognitive psychology notion of semantic strata for semantic subnetwork extraction from natural language queries. The cognitive model describes some of the fundamental structures employed by the human cognition to construct semantic information in the brain, called semantic strata. We propose a computational model based on conditional random fields to capture the cognitive abstraction provided by semantic strata, facilitating cognitive canonicalization of the query. Our results, conducted on approximately 5000 queries, suggest that the cognitive canonicals based on semantic strata are capable of significantly improving parsing and role labeling performance beyond pure lexical approaches, such as parts-of-speech based techniques. We also find that cognitive canonicalized subnets are more semantically coherent compared to syntax trees when explored in graph ontologies like DBpedia and improve ranking of retrieved documents.\",\"PeriodicalId\":412532,\"journal\":{\"name\":\"ACM Trans. Speech Lang. Process.\",\"volume\":\"10 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Trans. Speech Lang. Process.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2539053\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Trans. Speech Lang. Process.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2539053","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

自然语言搜索在很大程度上依赖于对查询句子语义的感知。语义由查询词之间的关系捕获，表示为网络(图)。这样的单词网络可以被输入到更大的本体中，如DBpedia或Google Knowledge Graph，在那里它们以子图的形式出现——形成名称子网(subnets)。因此，子网是将自然语言查询连接到图数据库的规范形式，是基于图的搜索的一个不可或缺的步骤。在本文中，我们提出了一种新的独立NLP技术，该技术利用语义层的认知心理学概念从自然语言查询中提取语义子网。认知模型描述了人类认知在大脑中构建语义信息所使用的一些基本结构，称为语义层。我们提出了一种基于条件随机场的计算模型来捕获语义层提供的认知抽象，促进查询的认知规范化。我们对大约5000个查询进行的结果表明，基于语义层的认知规范能够显著提高解析和角色标记性能，超过纯词法方法，如基于词性的技术。我们还发现，当在像DBpedia这样的图本体中探索时，认知规范化子网比语法树在语义上更连贯，并提高了检索文档的排名。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Cognitive canonicalization of natural language queries using semantic strata

Natural language search relies strongly on perceiving semantics in a query sentence. Semantics is captured by the relationship among the query words, represented as a network (graph). Such a network of words can be fed into larger ontologies, like DBpedia or Google Knowledge Graph, where they appear as subgraphs— fashioning the name subnetworks (subnets). Thus, subnet is a canonical form for interfacing a natural language query to a graph database and is an integral step for graph-based searching. In this article, we present a novel standalone NLP technique that leverages the cognitive psychology notion of semantic strata for semantic subnetwork extraction from natural language queries. The cognitive model describes some of the fundamental structures employed by the human cognition to construct semantic information in the brain, called semantic strata. We propose a computational model based on conditional random fields to capture the cognitive abstraction provided by semantic strata, facilitating cognitive canonicalization of the query. Our results, conducted on approximately 5000 queries, suggest that the cognitive canonicals based on semantic strata are capable of significantly improving parsing and role labeling performance beyond pure lexical approaches, such as parts-of-speech based techniques. We also find that cognitive canonicalized subnets are more semantically coherent compared to syntax trees when explored in graph ontologies like DBpedia and improve ranking of retrieved documents.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Trans. Speech Lang. Process.

自引率

0.00%

发文量