Query subtopic mining for search result diversification

2014 International Conference of Advanced Informatics: Concept, Theory and Application (ICAICTA) | Pub Date: 2014-08-01 | DOI: 10.1109/ICAICTA.2014.7005960

M. Z. Ullah, Masaki Aono


Citations: 6


Query subtopic mining for search result diversification

Web search queries are usually short, ambiguous, and contain multiple aspects or subtopics. Different users may have different search intents (or information needs) when submitting the same query. The task of identifying the subtopics underlying a query has received much attention in recent years. In this paper, we propose a method that exploits query reformulations provided by three major Web search engines (WSEs) as a means to uncover different query subtopics. We estimate the importance of the subtopics by introducing multiple query-dependent and query-independent features, and rank the subtopics by balancing relevancy and novelty. Our experiment with the NTCIR-10 INTENT-2 English Subtopic Mining test collection shows that our method outperforms all participants' methods in the NTCIR-10 INTENT-2 task in terms of D#-nDCG@10.
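The abstract does not spell out how relevancy and novelty are balanced, so the following is only an illustrative sketch of one common way to do it: a greedy, MMR-style (maximal marginal relevance) selection. It is not the authors' method. The function names, the `trade_off` parameter, the word-level Jaccard similarity used as the redundancy measure, and the example scores are all assumptions made for illustration.

```python
def tokens(text):
    """Lowercased whitespace tokens of a subtopic string, as a set."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word-level Jaccard similarity (an assumed stand-in for redundancy)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def rank_subtopics(candidates, trade_off=0.5):
    """Greedy MMR-style ranking of candidate subtopics.

    candidates: dict mapping subtopic string -> relevance score
                (hypothetical scores, e.g. from some feature-based estimate).
    trade_off:  weight on relevance; (1 - trade_off) penalises similarity
                to subtopics that were already ranked.
    """
    remaining = dict(candidates)
    ranked = []
    while remaining:
        def marginal_score(subtopic):
            # Redundancy is the highest similarity to anything ranked so far.
            redundancy = max(
                (jaccard(subtopic, chosen) for chosen in ranked), default=0.0
            )
            return (trade_off * remaining[subtopic]
                    - (1 - trade_off) * redundancy)

        best = max(remaining, key=marginal_score)
        ranked.append(best)
        del remaining[best]
    return ranked


if __name__ == "__main__":
    # Made-up candidates for the ambiguous query "jaguar".
    example = {
        "jaguar car price": 0.9,
        "jaguar car prices": 0.85,
        "jaguar animal habitat": 0.6,
    }
    for position, subtopic in enumerate(rank_subtopics(example), start=1):
        print(position, subtopic)
```

With these made-up scores and `trade_off=0.5`, the near-duplicate "jaguar car prices" is pushed below "jaguar animal habitat" even though its relevance score is higher, which is the kind of behaviour a relevancy/novelty balance is meant to produce.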

