Query Refinement based on Topical Term Clustering

RIAO Conference | Pub Date: 2007-05-30 | DOI: 10.5555/1931390.1931405

Hiromi Wakaki, Tomonari Masada, A. Takasu, J. Adachi


Citations: 0

Abstract


We propose a method for supporting query refinement using topical term clusters. Because a document set retrieved with an ambiguous query may include divergent topics, we first propose a new term weighting method that extracts terms strongly related to a specific topic. Our formulation of term weighting is based on the statistics of term co-occurrence. We then generate term clusters from the extracted terms and rerank the documents in the search results by using each term cluster as a query. This clustering procedure is intended to isolate each topic as a set of related terms. In our experiments, we evaluated our term weighting method by checking: 1) whether each of the top-ranked document sets corresponds to one topic; and 2) whether some of the top-ranked document sets cover all the topics included in the synthesized document set. The results of our experiments show that our method outperforms the existing term weighting methods MI, KLD, chi-square, and RSV.
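
The pipeline described in the abstract (weight terms by co-occurrence, cluster the extracted terms, rerank with each cluster as a query) can be illustrated with a small Python sketch. The abstract does not give the authors' weighting formula or clustering algorithm, so everything below is an assumption made for illustration: the weight is a placeholder (a term's strongest Jaccard co-occurrence with any other term), the clustering is simple single-link merging, and the function names, toy documents, and parameter values are hypothetical.

```python
from collections import Counter
from itertools import combinations


def term_sets(docs, query_terms):
    """Lowercase, split on whitespace, and drop the query's own terms."""
    return [set(d.lower().split()) - set(query_terms) for d in docs]


def jaccard_table(doc_terms):
    """Jaccard co-occurrence for every term pair that shares a document."""
    df = Counter(t for terms in doc_terms for t in terms)
    co = Counter()
    for terms in doc_terms:
        co.update(combinations(sorted(terms), 2))
    return {(a, b): n / (df[a] + df[b] - n) for (a, b), n in co.items()}


def top_terms(sim, k):
    """Placeholder weight (NOT the paper's formula): a term's strongest
    Jaccard co-occurrence with any other term. Keep the k best terms."""
    weight = {}
    for (a, b), s in sim.items():
        weight[a] = max(weight.get(a, 0.0), s)
        weight[b] = max(weight.get(b, 0.0), s)
    return set(sorted(weight, key=weight.get, reverse=True)[:k])


def cluster_terms(terms, sim, threshold):
    """Single-link clustering: merge terms whose similarity >= threshold."""
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path compression
            t = parent[t]
        return t

    for (a, b), s in sim.items():
        if a in parent and b in parent and s >= threshold:
            parent[find(a)] = find(b)
    clusters = {}
    for t in terms:
        clusters.setdefault(find(t), set()).add(t)
    return list(clusters.values())


def rerank(doc_terms, cluster):
    """Order document indices by how many cluster terms each one contains."""
    return sorted(range(len(doc_terms)), key=lambda i: -len(doc_terms[i] & cluster))


# Toy result set for the ambiguous query "jaguar" (hypothetical data).
docs = [
    "jaguar car engine speed",
    "jaguar engine car dealer",
    "jaguar cat jungle prey",
    "jaguar jungle cat habitat",
]
doc_terms = term_sets(docs, {"jaguar"})
sim = jaccard_table(doc_terms)
clusters = cluster_terms(top_terms(sim, k=4), sim, threshold=0.6)
for cluster in clusters:
    print(sorted(cluster), rerank(doc_terms, cluster))
```

On this toy input the sketch separates the car-related terms from the animal-related terms, and each cluster ranks its own two documents first. A real system would replace the placeholder weight with the paper's co-occurrence-based formulation and use a proper retrieval model for reranking.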
