Query Refinement based on Topical Term Clustering

RIAO Conference. Pub Date: 2007-05-30. DOI: 10.5555/1931390.1931405

Hiromi Wakaki, Tomonari Masada, A. Takasu, J. Adachi


Abstract

We propose a method for supporting query refinement using topical term clusters. Because a document set retrieved with an ambiguous query may include divergent topics, we first propose a new term weighting method that extracts terms strongly related to a specific topic. Our term weighting formulation is based on term co-occurrence statistics. We then generate term clusters from the extracted terms and rerank the documents in the search results using each term cluster as a query. This clustering procedure is intended to isolate each topic as a set of related terms. In our experiments, we evaluated our term weighting method by checking 1) whether each of the top-ranked document sets corresponds to one topic, and 2) whether some of the top-ranked document sets cover all the topics included in the synthesized document set. The results show that our method outperforms the existing term weighting methods MI, KLD, chi-square, and RSV.
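The three stages named in the abstract (select topical terms, cluster them, rerank documents per cluster) can be illustrated with a small sketch. The abstract does not give the authors' weighting formula, so everything below is an assumption made for illustration: a document-frequency threshold stands in for the proposed term weighting, Jaccard similarity over document-level co-occurrence stands in for the co-occurrence statistics, clustering is plain single-link grouping, and the function names and toy documents are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(docs):
    """Return (df, co): per-term document frequency and per-pair co-document counts."""
    df = defaultdict(int)
    co = defaultdict(int)
    for doc in docs:
        terms = sorted(set(doc))
        for t in terms:
            df[t] += 1
        for a, b in combinations(terms, 2):  # a < b because terms is sorted
            co[(a, b)] += 1
    return df, co


def candidate_terms(df, query_terms, min_df=2):
    """Stand-in for term weighting (NOT the paper's formula): keep frequent non-query terms."""
    return {t for t, n in df.items() if n >= min_df and t not in query_terms}


def jaccard(a, b, df, co):
    """Jaccard similarity of the document sets containing a and b."""
    key = (a, b) if a < b else (b, a)
    both = co.get(key, 0)
    return both / (df[a] + df[b] - both)


def cluster_terms(terms, df, co, threshold=0.5):
    """Single-link clustering: merge terms whose Jaccard similarity reaches the threshold."""
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for a, b in combinations(sorted(terms), 2):
        if jaccard(a, b, df, co) >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for t in terms:
        clusters[find(t)].add(t)
    return sorted(clusters.values(), key=lambda c: sorted(c))


def rerank(docs, cluster):
    """Order document indices by how many cluster terms each document contains."""
    return sorted(range(len(docs)), key=lambda i: (-len(cluster & set(docs[i])), i))


# Toy result set for the ambiguous query "jaguar" (hypothetical data).
docs = [
    ["jaguar", "car", "engine", "speed"],
    ["jaguar", "car", "engine", "dealer"],
    ["jaguar", "cat", "jungle", "prey"],
    ["jaguar", "cat", "jungle", "habitat"],
]
df, co = cooccurrence_counts(docs)
terms = candidate_terms(df, query_terms={"jaguar"})
clusters = cluster_terms(terms, df, co)
rankings = [rerank(docs, c) for c in clusters]
```

On this toy input the two clusters separate the vehicle sense from the animal sense, and using each cluster as a query moves the matching documents to the top of its ranking. A real implementation would replace `candidate_terms` with the weighting scheme defined in the paper itself.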
