将数据挖掘应用于伪相关反馈的高性能文本检索

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI:10.1109/ICDM.2006.22

Xiangji Huang, Y. Huang, M. Wen, Aijun An, Y. Liu, Josiah Poon

{"title":"将数据挖掘应用于伪相关反馈的高性能文本检索","authors":"Xiangji Huang, Y. Huang, M. Wen, Aijun An, Y. Liu, Josiah Poon","doi":"10.1109/ICDM.2006.22","DOIUrl":null,"url":null,"abstract":"In this paper, we investigate the use of data mining, in particular the text classification and co-training techniques, to identify more relevant passages based on a small set of labeled passages obtained from the blind feedback of a retrieval system. The data mining results are used to expand query terms and to re-estimate some of the parameters used in a probabilistic weighting function. We evaluate the data mining based feedback method on the TREC HARD data set. The results show that data mining can be successfully applied to improve the text retrieval performance. We report our experimental findings in detail.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"47","resultStr":"{\"title\":\"Applying Data Mining to Pseudo-Relevance Feedback for High Performance Text Retrieval\",\"authors\":\"Xiangji Huang, Y. Huang, M. Wen, Aijun An, Y. Liu, Josiah Poon\",\"doi\":\"10.1109/ICDM.2006.22\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we investigate the use of data mining, in particular the text classification and co-training techniques, to identify more relevant passages based on a small set of labeled passages obtained from the blind feedback of a retrieval system. The data mining results are used to expand query terms and to re-estimate some of the parameters used in a probabilistic weighting function. We evaluate the data mining based feedback method on the TREC HARD data set. The results show that data mining can be successfully applied to improve the text retrieval performance. We report our experimental findings in detail.\",\"PeriodicalId\":356443,\"journal\":{\"name\":\"Sixth International Conference on Data Mining (ICDM'06)\",\"volume\":\"97 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2006-12-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"47\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Sixth International Conference on Data Mining (ICDM'06)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDM.2006.22\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Sixth International Conference on Data Mining (ICDM'06)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2006.22","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 47

摘要

在本文中，我们研究了使用数据挖掘，特别是文本分类和协同训练技术，基于从检索系统的盲反馈中获得的一小组标记段落来识别更多相关的段落。数据挖掘结果用于扩展查询项并重新估计概率加权函数中使用的一些参数。我们在TREC HARD数据集上评估了基于数据挖掘的反馈方法。结果表明，数据挖掘可以有效地提高文本检索的性能。我们详细地报告了我们的实验结果。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Applying Data Mining to Pseudo-Relevance Feedback for High Performance Text Retrieval

In this paper, we investigate the use of data mining, in particular the text classification and co-training techniques, to identify more relevant passages based on a small set of labeled passages obtained from the blind feedback of a retrieval system. The data mining results are used to expand query terms and to re-estimate some of the parameters used in a probabilistic weighting function. We evaluate the data mining based feedback method on the TREC HARD data set. The results show that data mining can be successfully applied to improve the text retrieval performance. We report our experimental findings in detail.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Sixth International Conference on Data Mining (ICDM'06)

自引率

0.00%

发文量