Knowledge-based Question Answering by Jointly Generating, Copying and Paraphrasing
Shuguang Zhu, Xiang Cheng, Sen Su, S. Lang
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017-11-06
DOI: 10.1145/3132847.3133064 (https://doi.org/10.1145/3132847.3133064)
Citations: 8
Abstract
With the development of large-scale knowledge bases, people are building systems that give simple answers to questions based on consolidated facts. In this paper, we focus on simple questions, which ask about only a subject and a relation in the knowledge base. Observing that certain parts of a question usually overlap with the names of its corresponding subject and relation in the knowledge base, we argue that a question is formed by a mixture of copying and generation. To model this, we propose a sequence-to-sequence (seq2seq) architecture that encodes a candidate subject-relation pair and decodes it into the given question; the decoding probability is then used to select the best candidate. In our decoder, the copying mode points to the subject or relation and duplicates its name, while the generating mode summarizes the meaning of the subject-relation pair and produces a word to smooth the question. Because a question may refer to a subject or relation by a different name or keyword, we also incorporate a paraphrasing mode that supplements the copying mode using an automatically mined lexicon. Extensive experiments on the largest dataset show that our approach outperforms state-of-the-art methods.
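The candidate-selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: the neural encoder and decoder are replaced by hand-written distributions, and the fixed mode weights, the tiny vocabulary, the lexicon entries, and all function names (`token_prob`, `log_decoding_prob`, `select_best`) are assumptions made purely for illustration. The sketch only shows how a mixture of generating, copying, and paraphrasing modes can yield a decoding probability that ranks candidate subject-relation pairs.

```python
import math

# Hypothetical fixed mode weights. In a learned seq2seq decoder these would
# be predicted per decoding step, not constants.
MODE_WEIGHTS = {"generate": 0.4, "copy": 0.4, "paraphrase": 0.2}


def token_prob(word, source_tokens, lexicon, vocab):
    """Mixture probability of one question word under the three modes."""
    # Generating mode: uniform over a small fixed vocabulary.
    p_gen = 1.0 / len(vocab) if word in vocab else 0.0
    # Copying mode: uniform over tokens of the subject and relation names.
    p_copy = source_tokens.count(word) / len(source_tokens)
    # Paraphrasing mode: uniform over lexicon paraphrases of the source tokens.
    paraphrases = [p for t in source_tokens for p in lexicon.get(t, [])]
    p_para = paraphrases.count(word) / len(paraphrases) if paraphrases else 0.0
    return (MODE_WEIGHTS["generate"] * p_gen
            + MODE_WEIGHTS["copy"] * p_copy
            + MODE_WEIGHTS["paraphrase"] * p_para)


def log_decoding_prob(question, candidate, lexicon, vocab, floor=1e-9):
    """Log-probability of decoding the question from a (subject, relation) pair."""
    subject, relation = candidate
    source_tokens = subject.split() + relation.split()
    # The floor avoids log(0) for words that no mode can produce.
    return sum(
        math.log(max(token_prob(w, source_tokens, lexicon, vocab), floor))
        for w in question.split()
    )


def select_best(question, candidates, lexicon, vocab):
    """Pick the candidate pair with the highest decoding probability."""
    return max(
        candidates,
        key=lambda c: log_decoding_prob(question, c, lexicon, vocab),
    )


# Made-up example data (not taken from the paper or any dataset).
vocab = ["who", "what", "is", "the", "of", "was", "where"]
lexicon = {"author": ["wrote", "writer"], "birthplace": ["born"]}
candidates = [("hamlet", "author"), ("hamlet", "birthplace")]
question = "who wrote hamlet"

best = select_best(question, candidates, lexicon, vocab)
print(best)
```

In this toy setting, "hamlet" is explained by the copying mode, "who" by the generating mode, and "wrote" only by the paraphrasing mode through the lexicon entry for "author", which is what separates the two candidates.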