{"title":"Concept Embedded Convolutional Semantic Model for Question Retrieval","authors":"P. Wang, Yong Zhang, Lei Ji, Jun Yan, Lianwen Jin","doi":"10.1145/3018661.3018687","DOIUrl":null,"url":null,"abstract":"The question retrieval, which aims to find similar questions of a given question, is playing pivotal role in various question answering (QA) systems. This task is quite challenging mainly on three aspects: lexical gap, polysemy and word order. In this paper, we propose a unified framework to simultaneously handle these three problems. We use word combined with corresponding concept information to handle the polysemous problem. The concept embedding and word embedding are learned at the same time from both context-dependent and context-independent view. The lexical gap problem is handled since the semantic information has been encoded into the embedding. Then, we propose to use a high-level feature embedded convolutional semantic model to learn the question embedding by inputting the concept embedding and word embedding without manually labeling training data. The proposed framework nicely represent the hierarchical structures of word information and concept information in sentences with their layer-by-layer composition and pooling. Finally, the framework is trained in a weakly-supervised manner on question answer pairs, which can be directly obtained without manually labeling. Experiments on two real question answering datasets show that the proposed framework can significantly outperform the state-of-the-art solutions.","PeriodicalId":344017,"journal":{"name":"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3018661.3018687","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 10
Abstract
Question retrieval, which aims to find questions similar to a given question, plays a pivotal role in various question answering (QA) systems. The task is challenging mainly in three respects: the lexical gap, polysemy, and word order. In this paper, we propose a unified framework that handles these three problems simultaneously. We combine each word with its corresponding concept information to address polysemy. The concept embeddings and word embeddings are learned jointly from both context-dependent and context-independent views. Because semantic information is encoded into the embeddings, the lexical gap problem is also mitigated. We then propose a high-level feature embedded convolutional semantic model that learns question embeddings from the concept and word embeddings without manually labeled training data. The framework naturally represents the hierarchical structure of word and concept information in sentences through layer-by-layer composition and pooling. Finally, the framework is trained in a weakly supervised manner on question-answer pairs, which can be obtained directly without manual labeling. Experiments on two real-world question answering datasets show that the proposed framework significantly outperforms state-of-the-art solutions.
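The abstract describes an architecture that feeds concatenated word and concept embeddings into a convolutional encoder with pooling, trained weakly on question-answer pairs. The sketch below is not the authors' code; it is a minimal PyTorch illustration of that general idea, with all hyperparameters (embedding size, filter count, window width, margin) and names (ConceptCNN, ranking_loss) chosen as assumptions for clarity.

```python
# Minimal sketch of a concept-embedded convolutional semantic model (assumed design,
# not the paper's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConceptCNN(nn.Module):
    def __init__(self, vocab_size, concept_size, embed_dim=100, num_filters=300, window=3):
        super().__init__()
        # Separate lookup tables for words and their concepts; in the paper these
        # embeddings are pre-trained jointly, here they are randomly initialised.
        self.word_emb = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.concept_emb = nn.Embedding(concept_size, embed_dim, padding_idx=0)
        # 1-D convolution over the sequence of [word; concept] vectors,
        # followed by max pooling to obtain a fixed-length question embedding.
        self.conv = nn.Conv1d(2 * embed_dim, num_filters, kernel_size=window, padding=window // 2)

    def forward(self, word_ids, concept_ids):
        # word_ids, concept_ids: (batch, seq_len)
        x = torch.cat([self.word_emb(word_ids), self.concept_emb(concept_ids)], dim=-1)
        x = x.transpose(1, 2)                 # (batch, 2*embed_dim, seq_len)
        h = torch.tanh(self.conv(x))          # layer-by-layer composition
        return h.max(dim=2).values            # max pooling over positions


def ranking_loss(q_vec, pos_vec, neg_vec, margin=0.5):
    """Weak supervision: the answer actually paired with a question should score a
    higher cosine similarity than a randomly sampled (negative) answer."""
    pos = F.cosine_similarity(q_vec, pos_vec)
    neg = F.cosine_similarity(q_vec, neg_vec)
    return torch.clamp(margin - pos + neg, min=0).mean()
```

In such a setup, the same encoder (shared weights) would embed both questions and answers during training, and at retrieval time candidate questions would be ranked by cosine similarity between their embeddings and the query's embedding; this mirrors the weakly supervised question-answer-pair training the abstract mentions, though the exact loss and similarity function used by the authors may differ.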