Natural language queries over heterogeneous linked data graphs: a distributional-compositional semantics approach

Proceedings of the 19th International Conference on Intelligent User Interfaces. Pub Date: 2014-02-24. DOI: 10.1145/2557500.2557534

A. Freitas, E. Curry


Citations: 48

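Abstract

The demand to access large amounts of heterogeneous structured data is emerging as a trend for many users and applications. However, the effort involved in querying heterogeneous and distributed third-party databases can create major barriers for data consumers. At the core of this problem is the semantic gap between the way users express their information needs and the way the data is represented. This work aims to provide a natural language interface and an associated semantic index that support a higher level of vocabulary independence for queries over Linked Data/Semantic Web datasets, using a distributional-compositional semantics approach. Distributional semantics focuses on the automatic construction of a semantic model based on the statistical distribution of co-occurring words in large-scale texts. The proposed query model targets the following features: (i) a principled semantic approximation approach with low adaptation effort (independent of manually created resources such as ontologies, thesauri or dictionaries), (ii) comprehensive semantic matching supported by the inclusion of large volumes of distributional (unstructured) commonsense knowledge in the semantic approximation process, and (iii) expressive natural language queries. The approach is evaluated using natural language queries on an open-domain dataset and achieves an average recall of 0.81, a mean average precision of 0.62 and a mean reciprocal rank of 0.49.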
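To make the idea of distributional semantics concrete, the following is a minimal sketch, not the authors' model: it builds sparse co-occurrence vectors from a tiny hypothetical corpus and ranks candidate dataset terms against a query term by cosine similarity. The corpus, the window size and the function names (`cooccurrence_vectors`, `cosine`, `rank_candidates`) are all assumptions made for illustration; the paper relies on large-scale text rather than a handful of sentences.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real distributional model uses large-scale text.
corpus = [
    "the wife of the president",
    "the spouse of the president",
    "the daughter of the king",
    "the mountain is high",
]

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_candidates(query_term, candidates, vectors):
    """Rank candidate vocabulary terms by similarity to the query term."""
    empty = Counter()
    query_vec = vectors.get(query_term, empty)
    scored = [(cosine(query_vec, vectors.get(c, empty)), c) for c in candidates]
    return sorted(scored, reverse=True)

vectors = cooccurrence_vectors(corpus)
# On this toy corpus, "spouse" shares its contexts with "wife" and so
# ranks above "mountain".
for score, term in rank_candidates("wife", ["spouse", "mountain"], vectors):
    print(f"{term}: {score:.3f}")
```

The point of the sketch is that the query word "wife" can be matched to a dataset term such as "spouse" without a thesaurus, purely because the two words occur in similar contexts.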


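The three reported evaluation figures can be illustrated with a short sketch of how such metrics are commonly computed over ranked answers. This follows the standard textbook definitions and assumes they match the paper's setup, which the abstract does not spell out; the function names and the two example runs are hypothetical.

```python
def recall(ranked, relevant):
    """Fraction of the relevant items that appear anywhere in the ranked list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)

def average_precision(ranked, relevant):
    """Mean of precision values taken at each rank where a relevant item occurs."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is returned."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def evaluate(runs):
    """Average each metric over (ranked_answers, relevant_set) pairs, one per query."""
    n = len(runs)
    return {
        "avg_recall": sum(recall(r, rel) for r, rel in runs) / n,
        "map": sum(average_precision(r, rel) for r, rel in runs) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
    }

# Two hypothetical queries: ranked system answers and the gold answer set.
runs = [
    (["a", "b", "c"], {"a", "c"}),
    (["x", "y"], {"y", "z"}),
]
print(evaluate(runs))
```

Recall ignores ordering, while mean average precision and mean reciprocal rank reward placing correct answers near the top of the list, which is why a system can score high on the first and lower on the other two.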
