Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval最新文献

筛选
英文 中文
Efficient and effective solutions for search engines 高效和有效的解决方案的搜索引擎
Xiangfei Jia
{"title":"Efficient and effective solutions for search engines","authors":"Xiangfei Jia","doi":"10.1145/2009916.2010179","DOIUrl":"https://doi.org/10.1145/2009916.2010179","url":null,"abstract":"IR efficiency is normally addressed in terms of accumulator initialisation, disk I/O, decompression, ranking and sorting. In this paper, The bottlenecks of search engines are investigated and several solutions for efficiency optimisation are proposed. Particularly, the proposed heapk pruning algorithm is very efficient and effective. The future directions of this research work are also discussed.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131189548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Faster top-k document retrieval using block-max indexes 使用块最大索引更快地检索top-k文档
Shuai Ding, Torsten Suel
{"title":"Faster top-k document retrieval using block-max indexes","authors":"Shuai Ding, Torsten Suel","doi":"10.1145/2009916.2010048","DOIUrl":"https://doi.org/10.1145/2009916.2010048","url":null,"abstract":"Large search engines process thousands of queries per second over billions of documents, making query processing a major performance bottleneck. An important class of optimization techniques called early termination achieves faster query processing by avoiding the scoring of documents that are unlikely to be in the top results. We study new algorithms for early termination that outperform previous methods. In particular, we focus on safe techniques for disjunctive queries, which return the same result as an exhaustive evaluation over the disjunction of the query terms. The current state-of-the-art methods for this case, the WAND algorithm by Broder et al. [11] and the approach of Strohman and Croft [30], achieve great benefits but still leave a large performance gap between disjunctive and (even non-early terminated) conjunctive queries. We propose a new set of algorithms by introducing a simple augmented inverted index structure called a block-max index. Essentially, this is a structure that stores the maximum impact score for each block of a compressed inverted list in uncompressed form, thus enabling us to skip large parts of the lists. We show how to integrate this structure into the WAND approach, leading to considerable performance gains. We then describe extensions to a layered index organization, and to indexes with reassigned document IDs, that achieve additional gains that narrow the gap between disjunctive and conjunctive top-k query processing.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115339971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 209
Associative tag recommendation exploiting multiple textual features 利用多种文本特征的关联标签推荐
F. Belém, E. Martins, Tatiana Pontes, J. Almeida, Marcos André Gonçalves
{"title":"Associative tag recommendation exploiting multiple textual features","authors":"F. Belém, E. Martins, Tatiana Pontes, J. Almeida, Marcos André Gonçalves","doi":"10.1145/2009916.2010053","DOIUrl":"https://doi.org/10.1145/2009916.2010053","url":null,"abstract":"This work addresses the task of recommending relevant tags to a target object by jointly exploiting three dimensions of the problem: (i) term co-occurrence with tags pre-assigned to the target object, (ii) terms extracted from multiple textual features, and (iii) several metrics of tag relevance. In particular, we propose several new heuristic methods, which extend state-of-the-art strategies by including new metrics that try to capture how accurately a candidate term describes the object's content. We also exploit two learning-to-rank (L2R) techniques, namely RankSVM and Genetic Programming, for the task of generating ranking functions that combine multiple metrics to accurately estimate the relevance of a tag to a given object. We evaluate all proposed methods in various scenarios for three popular Web 2.0 applications, namely, LastFM, YouTube and YahooVideo. We found that our new heuristics greatly outperform the methods on which they are based, producing gains in precision of up to 181%, as well as another state-of-the-art technique, with improvements in precision of up to 40% over the best baseline in any scenario. Further improvements can also be achieved with the new L2R strategies, which have the additional advantage of being quite flexible and extensible to exploit other aspects of the tag recommendation problem.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115695418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 44
Predicting eBay listing conversion 预测eBay上市转化率
T. Yuan, Zhaohui Chen, Mike Mathieson
{"title":"Predicting eBay listing conversion","authors":"T. Yuan, Zhaohui Chen, Mike Mathieson","doi":"10.1145/2009916.2010188","DOIUrl":"https://doi.org/10.1145/2009916.2010188","url":null,"abstract":"At eBay Market Place, listing conversion rate can be measured by number of items sold divided by number of items in a sample set. For a given item, conversion rate can also be treated as the probability of sale. By investigating eBay listings' transactional patterns, as well as item attributes and user click-through data, we developed conversion models that allow us to predict a live listing's probability of sale. In this paper, we discuss the design and implementation of such conversion models. These models are highly valuable in analysis of inventory quality and ranking. Our work reveals the uniqueness of sales-oriented search at eBay and its similarity to general web search problems.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"294 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115767382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Personalized video: leanback online video consumption 个性化视频:leanback在线视频消费
Krishnan Ramanathan, Y. Sankarasubramaniam, Vidhya Govindaraju
{"title":"Personalized video: leanback online video consumption","authors":"Krishnan Ramanathan, Y. Sankarasubramaniam, Vidhya Govindaraju","doi":"10.1145/2009916.2010158","DOIUrl":"https://doi.org/10.1145/2009916.2010158","url":null,"abstract":"Current user interfaces for online video consumption are mostly browser based, lean forward, require constant interaction and provide a fragmented view of the total content available. For easier consumption, the user interface and interactions need to be redesigned for less interruptive and lean back experience. In this paper, we describe Personalized Video, an application that converts the online video experience into a personalized lean back experience. It has been implemented on the Windows platform and integrated with intuitive user interactions like gesture and face recognition. It also supports group personalization for concurrent users.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116444234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
iMecho: a context-aware desktop search system iMecho:一个上下文感知的桌面搜索系统
Jidong Chen, Hang Guo, Wentao Wu, Wei Wang
{"title":"iMecho: a context-aware desktop search system","authors":"Jidong Chen, Hang Guo, Wentao Wu, Wei Wang","doi":"10.1145/2009916.2010154","DOIUrl":"https://doi.org/10.1145/2009916.2010154","url":null,"abstract":"In this demo, we present iMecho, a context-aware desktop search system to help users get more relevant results. Different from other desktop search engines, iMecho ranks results not only by the content of the query, but also the context of the query. It employs an Hidden Markov Model (HMM)-based user model, which is learned from user's activity logs, to estimate the query context when he submits the query. The results from keyword search are re-ranked by their relevances to the context with acceptable overhead.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123516017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Multi-objective optimization in learning to rank 排序学习中的多目标优化
Na Dai, Milad Shokouhi, Brian D. Davison
{"title":"Multi-objective optimization in learning to rank","authors":"Na Dai, Milad Shokouhi, Brian D. Davison","doi":"10.1145/2009916.2010139","DOIUrl":"https://doi.org/10.1145/2009916.2010139","url":null,"abstract":"Supervised learning to rank algorithms typically optimize for high relevance and ignore other facets of search quality, such as freshness and diversity. Prior work on multi-objective ranking trained rankers focused on using hybrid labels that combine overall quality of documents, and implicitly incorporate multiple criteria into quantifying ranking risks. However, these hybrid scores are usually generated based on heuristics without considering potential correlations between individual facets (e.g., freshness versus relevance). In this poster, we empirically demonstrate that the correlation between objective facets in multi-criteria ranking optimization may significantly influence the effectiveness of trained rankers with respect to each objective.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117184204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Do IR models satisfy the TDC retrieval constraint IR模型是否满足TDC检索约束
S. Clinchant, Éric Gaussier
{"title":"Do IR models satisfy the TDC retrieval constraint","authors":"S. Clinchant, Éric Gaussier","doi":"10.1145/2009916.2010096","DOIUrl":"https://doi.org/10.1145/2009916.2010096","url":null,"abstract":"If it has been show in previous studies (as [5, 2]) that most IR models satisfy most IR constraints, the situation of the TDC constraint is unclear, and the goal of this short paper is to show that several state-of-the-art IR models indeed do not comply with the general TDC constraint, but do satisfy the speTDC one. We will review here the recently intro- duced log-logistic model [2], as well as the Jelinek-Mercer and Dirichlet language models.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125740766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Enriching document representation via translation for improved monolingual information retrieval 通过翻译丰富文档表示以改进单语信息检索
Seung-Hoon Na, H. Ng
{"title":"Enriching document representation via translation for improved monolingual information retrieval","authors":"Seung-Hoon Na, H. Ng","doi":"10.1145/2009916.2010030","DOIUrl":"https://doi.org/10.1145/2009916.2010030","url":null,"abstract":"Word ambiguity and vocabulary mismatch are critical problems in information retrieval. To deal with these problems, this paper proposes the use of translated words to enrich document representation, going beyond the words in the original source language to represent a document. In our approach, each original document is automatically translated into an auxiliary language, and the resulting translated document serves as a semantically enhanced representation for supplementing the original bag of words. The core of our translation representation is the expected term frequency of a word in a translated document, which is calculated by averaging the term frequencies over all possible translations, rather than focusing on the 1-best translation only. To achieve better efficiency of translation, we do not rely on full-fledged machine translation, but instead use monotonic translation by removing the time-consuming reordering component. Experiments carried out on standard TREC test collections show that our proposed translation representation leads to statistically significant improvements over using only the original language of the document collection.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128752345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Scalable multi-dimensional user intent identification using tree structured distributions 使用树状结构分布的可扩展多维用户意图识别
Vinay Jethava, Liliana Calderón-Benavides, R. Baeza-Yates, C. Bhattacharyya, Devdatt P. Dubhashi
{"title":"Scalable multi-dimensional user intent identification using tree structured distributions","authors":"Vinay Jethava, Liliana Calderón-Benavides, R. Baeza-Yates, C. Bhattacharyya, Devdatt P. Dubhashi","doi":"10.1145/2009916.2009971","DOIUrl":"https://doi.org/10.1145/2009916.2009971","url":null,"abstract":"The problem of identifying user intent has received considerable attention in recent years, particularly in the context of improving the search experience via query contextualization. Intent can be characterized by multiple dimensions, which are often not observed from query words alone. Accurate identification of Intent from query words remains a challenging problem primarily because it is extremely difficult to discover these dimensions. The problem is often significantly compounded due to lack of representative training sample. We present a generic, extensible framework for learning the multi-dimensional representation of user intent from the query words. The approach models the latent relationships between facets using tree structured distribution which leads to an efficient and convergent algorithm, FastQ, for identifying the multi-faceted intent of users based on just the query words. We also incorporated WordNet to extend the system capabilities to queries which contain words that do not appear in the training data. Empirical results show that FastQ yields accurate identification of intent when compared to a gold standard.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127259356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信