Knowledge-based Question Answering by Jointly Generating, Copying and Paraphrasing

Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. Pub Date: 2017-11-06. DOI: 10.1145/3132847.3133064

Shuguang Zhu, Xiang Cheng, Sen Su, S. Lang


Citations: 8

Abstract

With the development of large-scale knowledge bases, people are building systems that give simple answers to questions based on consolidated facts. In this paper, we focus on simple questions, each of which asks about only one subject and one relation in the knowledge base. Observing that certain parts of a question usually overlap with the names of its corresponding subject and relation in the knowledge base, we argue that a question is formed by a mixture of copying and generation. To model this, we propose a sequence-to-sequence (seq2seq) architecture that encodes a candidate subject-relation pair and decodes it into the given question; the decoding probability is then used to select the best candidate. In our decoder, the copying mode points to the subject or relation and duplicates its name, while the generating mode summarizes the meaning of the subject-relation pair and produces a word to smooth the question. Because a question may refer to a subject or relation by a different name or keyword, we also incorporate a paraphrasing mode, which supplements the copying mode using an automatically mined lexicon. Extensive experiments on the largest dataset show that our approach outperforms state-of-the-art methods.
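
The candidate-selection idea in the abstract can be illustrated with a minimal sketch. This is not the seq2seq model from the paper: the neural encoder and decoder are replaced by a hand-written mixture of three uniform distributions, and the mode weights, the toy lexicon, the vocabulary, the single-word subject and relation names, and all function names (`word_prob`, `question_log_prob`, `select_candidate`) are assumptions made only for illustration. It shows only how a question's probability under each candidate subject-relation pair can be used to rank the candidates.

```python
import math

# Assumed fixed mixture weights; in the paper the decoder learns how to
# combine the modes, which this toy sketch does not attempt to reproduce.
MODE_WEIGHTS = {"copy": 0.4, "paraphrase": 0.3, "generate": 0.3}
FLOOR = 1e-12  # avoids log(0) when no mode can produce a word


def word_prob(word, candidate, lexicon, vocab):
    """Probability of one question word given a (subject, relation) pair."""
    source = list(candidate)
    # Copy mode: uniform over the names in the candidate pair.
    p_copy = source.count(word) / len(source)
    # Paraphrase mode: uniform over lexicon entries for those names.
    paraphrases = [p for name in source for p in lexicon.get(name, [])]
    p_para = paraphrases.count(word) / len(paraphrases) if paraphrases else 0.0
    # Generate mode: uniform over a small fixed vocabulary.
    p_gen = 1.0 / len(vocab) if word in vocab else 0.0
    return (MODE_WEIGHTS["copy"] * p_copy
            + MODE_WEIGHTS["paraphrase"] * p_para
            + MODE_WEIGHTS["generate"] * p_gen)


def question_log_prob(question, candidate, lexicon, vocab):
    """Log-probability of decoding the whole question from the candidate."""
    return sum(math.log(max(word_prob(w, candidate, lexicon, vocab), FLOOR))
               for w in question.split())


def select_candidate(question, candidates, lexicon, vocab):
    """Pick the candidate pair under which the question is most probable."""
    return max(candidates,
               key=lambda c: question_log_prob(question, c, lexicon, vocab))


# Hypothetical toy data, not taken from the paper or its dataset.
lexicon = {"author": ["wrote", "writer"], "genre": ["kind", "type"]}
vocab = ["who", "what", "is", "the", "of"]
candidates = [("hamlet", "author"), ("macbeth", "author"), ("hamlet", "genre")]

best = select_candidate("who wrote hamlet", candidates, lexicon, vocab)
print(best)  # ('hamlet', 'author')
```

In this toy example, "hamlet" is explained by the copy mode, "wrote" by the paraphrase mode (via the lexicon entry for "author"), and "who" by the generate mode, so only the pair ("hamlet", "author") gives every word a non-negligible probability.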
