A word clustering approach for language model-based sentence retrieval in question answering systems

Proceedings of the 18th ACM Conference on Information and Knowledge Management | Pub Date: 2009-11-02 | DOI: 10.1145/1645953.1646263

S. Momtazi, D. Klakow

Citations: 40

Abstract

In this paper we propose a term clustering approach to improve the performance of sentence retrieval in Question Answering (QA) systems. Because search in question answering is conducted over smaller segments of data than in a document retrieval task, the problems of data sparsity and exact matching become more critical. We use Language Modeling (LM) techniques to overcome these problems: we build class-based models by term clustering and then employ higher-order n-grams with the new class-based model. We report experiments on the questions from the TREC 2007 QA track. The results show that the methods investigated here raised the mean average precision of sentence retrieval from 23.62% to 29.91%.
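The idea of matching on word classes rather than exact words can be illustrated with a minimal sketch. This is not the authors' implementation: the cluster map `WORD_TO_CLASS`, the function names, the linear interpolation weight `lam`, and the add-one collection smoothing are all assumptions made for illustration, and only the unigram case is shown (the abstract also mentions higher-order n-grams over classes).

```python
import math
from collections import Counter

# Hypothetical word-to-cluster map. In practice the clusters would be
# learned from a large corpus; these hand-made entries are for illustration.
WORD_TO_CLASS = {
    "wrote": "C_AUTHOR", "written": "C_AUTHOR", "author": "C_AUTHOR",
    "novel": "C_BOOK", "book": "C_BOOK",
}

def to_classes(tokens):
    """Replace each token by its cluster label; unclustered words stay as-is."""
    return [WORD_TO_CLASS.get(t, t) for t in tokens]

def query_log_likelihood(query, sentence, collection, lam=0.5):
    """log P(query | sentence) under a unigram model.

    The sentence model is linearly interpolated with an add-one-smoothed
    collection model (an assumed smoothing choice), so the probability
    of every query term is strictly positive.
    """
    sent_counts = Counter(sentence)
    coll_len = sum(collection.values())
    vocab = len(collection) + 1  # +1 reserves mass for unseen terms
    score = 0.0
    for term in query:
        p_sent = sent_counts[term] / len(sentence) if sentence else 0.0
        p_coll = (collection[term] + 1) / (coll_len + vocab)
        score += math.log(lam * p_sent + (1 - lam) * p_coll)
    return score

def rank(query, sentences, use_classes):
    """Return sentence indices ordered from best to worst match."""
    prep = to_classes if use_classes else list
    q = prep(query.lower().split())
    sents = [prep(s.lower().split()) for s in sentences]
    collection = Counter(t for s in sents for t in s)
    scored = [(query_log_likelihood(q, s, collection), i)
              for i, s in enumerate(sents)]
    return [i for _, i in sorted(scored, reverse=True)]

sentences = [
    "the weather was cold in london",
    "she is the author of that book",
]
query = "who wrote the novel"
print(rank(query, sentences, use_classes=False))
print(rank(query, sentences, use_classes=True))
```

With exact word matching, the only term the query shares with either sentence is "the", so the relevant second sentence gains nothing over the first. Once "wrote"/"author" and "novel"/"book" are mapped to shared classes, the second sentence matches on two more terms, which is the sparsity and exact-matching problem the abstract describes.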



