Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval最新文献

筛选
英文 中文
On theme location discovery for travelogue services 关于旅游日志服务的主题地点发现
Mao Ye, Rong Xiao, Wang-Chien Lee, Xing Xie
{"title":"On theme location discovery for travelogue services","authors":"Mao Ye, Rong Xiao, Wang-Chien Lee, Xing Xie","doi":"10.1145/2009916.2009980","DOIUrl":"https://doi.org/10.1145/2009916.2009980","url":null,"abstract":"In this paper, we aim to develop a travelogue service that discovers and conveys various travelogue digests, in form of theme locations, geographical scope, traveling trajectory and location snippet, to users. In this service, theme locations in a travelogue are the core information to discover. Thus we aim to address the problem of theme location discovery to enable the above travelogue services. Due to the inherent ambiguity of location relevance, we perform location relevance mining (LRM) in two complementary angles, relevance classification and relevance ranking, to provide comprehensive understanding of locations. Furthermore, we explore the textual (e.g., surrounding words) and geographical (e.g., geographical relationship among locations) features of locations to develop a co-training model for enhancement of classification performance. Built upon the mining result of LRM, we develop a series of techniques for provisioning of the aforementioned travelogue digests in our travelogue system. Finally, we conduct comprehensive experiments on collected travelogues to evaluate the performance of our location relevance mining techniques and demonstrate the effectiveness of the travelogue service.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126459149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 12
Mining tags using social endorsement networks 使用社会认可网络挖掘标签
Theodoros Lappas, Kunal Punera, Tamás Sarlós
{"title":"Mining tags using social endorsement networks","authors":"Theodoros Lappas, Kunal Punera, Tamás Sarlós","doi":"10.1145/2009916.2009946","DOIUrl":"https://doi.org/10.1145/2009916.2009946","url":null,"abstract":"Entities on social systems, such as users on Twitter, and images on Flickr, are at the core of many interesting applications: they can be ranked in search results, recommended to users, or used in contextual advertising. Such applications assume knowledge of an entity's nature and characteristic attributes. An effective way to encode such knowledge is in the form of tags. An untagged entity is practically inaccessible, since it is hard to retrieve or interact with. To address this, some platforms allow users to manually tag entities. However,while such tags can be informative, they can oftentimes be inadequate, trivial, ambiguous, or even plain false. Numerous automated tagging methods have been proposed to address these issues. However,most of them require pre-existing high-quality tags or descriptive texts for every entity that needs to be tagged. In our work, we propose a method based on social endorsements that is free from such constraints. Virtually every major social networking platform allows users to endorse entities that they find appealing. Examples include \"following\" Twitter users or \"favoriting\" Flickr photos. These endorsements are abundant and directly capture the preferences of users. In this paper, we pose and solve the problem of using the underlying social endorsement network to extract useful tags for entities in a social system. Our work leverages techniques from topic modeling to capture the interests of users and then uses them to extract relevant and descriptive tags for the entities they endorse. We perform an extensive evaluation of our proposed approach on real large-scale datasets from both Twitter and Flickr, and show that it significantly outperforms meaningful and competitive baselines.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124136534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 22
Effect of different docid orderings on dynamic pruning retrieval strategies 不同文献排序对动态剪枝检索策略的影响
N. Tonellotto, C. Macdonald, I. Ounis
{"title":"Effect of different docid orderings on dynamic pruning retrieval strategies","authors":"N. Tonellotto, C. Macdonald, I. Ounis","doi":"10.1145/2009916.2010108","DOIUrl":"https://doi.org/10.1145/2009916.2010108","url":null,"abstract":"Document-at-a-time (DAAT) dynamic pruning strategies for information retrieval systems such as MaxScore and Wand can increase querying efficiency without decreasing effectiveness. Both work on posting lists sorted by ascending document identifier (docid). The order in which docids are assigned -- and hence the order of postings in the posting lists -- is known to have a noticeable impact on posting list compression. However, the resulting impact on dynamic pruning strategies is not well understood. In this poster, we examine the impact on the efficiency of these strategies across different docid orderings, by experimenting using the TREC ClueWeb09 corpus. We find that while the number of postings scored by dynamic pruning strategies do not markedly vary for different docid orderings, the ordering still has a marked impact on mean query response time. Moreover, when docids are assigned by lexicographical URL ordering, the benefit to response time for is more pronounced for Wand than for MaxScore.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123809878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 13
Image annotation based on recommendation model 基于推荐模型的图像标注
Zijia Lin, Guiguang Ding, Jianmin Wang
{"title":"Image annotation based on recommendation model","authors":"Zijia Lin, Guiguang Ding, Jianmin Wang","doi":"10.1145/2009916.2010067","DOIUrl":"https://doi.org/10.1145/2009916.2010067","url":null,"abstract":"In this paper, a novel approach based on recommendation model is proposed for automatic image annotation. For any to-be-annotated image, we first select some related images with tags from training dataset according to their visual similarity. And then we estimate the initial ratings for tags of the training images based on tag ranking method and construct a rating matrix. We also construct a trust matrix based on visual similarity with a k-NN strategy. Then a recommendation model is built on both matrices to rank candidate tags for the target image. The proposed approach is evaluated using two benchmark image datasets, and experimental results have indicated its effectiveness.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122292439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
Disambiguating biomedical acronyms using EMIM 使用EMIM消除生物医学缩略语的歧义
Nut Limsopatham, Rodrygo L. T. Santos, C. Macdonald, I. Ounis
{"title":"Disambiguating biomedical acronyms using EMIM","authors":"Nut Limsopatham, Rodrygo L. T. Santos, C. Macdonald, I. Ounis","doi":"10.1145/2009916.2010125","DOIUrl":"https://doi.org/10.1145/2009916.2010125","url":null,"abstract":"Expanding a query with acronyms or their corresponding 'long-forms' has not been shown to provide consistent improvements in the biomedical IR literature. The major open issue with expanding acronyms in a query is their inherent ambiguity, as an acronym can refer to multiple long-forms. At the same time, a long-form identified in a query can be expanded with its acronym(s); however, some of these may be also ambiguous and lead to poor retrieval performance. In this work, we propose the use of the EMIM (Expected Mutual Information Measure) between a long-form and its abbreviated acronym to measure ambiguity. We experiment with expanding both acronyms and long-forms identified in the queries from the adhoc task of the TREC 2004 Genomics track. Our preliminary analysis shows the potential of both acronym and long-form expansions for biomedical IR.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133032782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
Quantifying test collection quality based on the consistency of relevance judgements 基于相关判断的一致性来量化测试集合的质量
Falk Scholer, A. Turpin, M. Sanderson
{"title":"Quantifying test collection quality based on the consistency of relevance judgements","authors":"Falk Scholer, A. Turpin, M. Sanderson","doi":"10.1145/2009916.2010057","DOIUrl":"https://doi.org/10.1145/2009916.2010057","url":null,"abstract":"Relevance assessments are a key component for test collection-based evaluation of information retrieval systems. This paper reports on a feature of such collections that is used as a form of ground truth data to allow analysis of human assessment error. A wide range of test collections are retrospectively examined to determine how accurately assessors judge the relevance of documents. Our results demonstrate a high level of inconsistency across the collections studied. The level of irregularity is shown to vary across topics, with some showing a very high level of assessment error. We investigate possible influences on the error, and demonstrate that inconsistency in judging increases with time. While the level of detail in a topic specification does not appear to influence the errors that assessors make, judgements are significantly affected by the decisions made on previously seen similar documents. Assessors also display an assessment inertia. Alternate approaches to generating relevance judgements appear to reduce errors. A further investigation of the way that retrieval systems are ranked using sets of relevance judgements produced early and late in the judgement process reveals a consistent influence measured across the majority of examined test collections. We conclude that there is a clear value in examining, even inserting, ground truth data in test collections, and propose ways to help minimise the sources of inconsistency when creating future test collections.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132801523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 81
Handling data sparsity in collaborative filtering using emotion and semantic based features 使用基于情感和语义的特征处理协同过滤中的数据稀疏性
Yashar Moshfeghi, Benjamin Piwowarski, J. Jose
{"title":"Handling data sparsity in collaborative filtering using emotion and semantic based features","authors":"Yashar Moshfeghi, Benjamin Piwowarski, J. Jose","doi":"10.1145/2009916.2010001","DOIUrl":"https://doi.org/10.1145/2009916.2010001","url":null,"abstract":"Collaborative filtering (CF) aims to recommend items based on prior user interaction. Despite their success, CF techniques do not handle data sparsity well, especially in the case of the cold start problem where there is no past rating for an item. In this paper, we provide a framework, which is able to tackle such issues by considering item-related emotions and semantic data. In order to predict the rating of an item for a given user, this framework relies on an extension of Latent Dirichlet Allocation, and on gradient boosted trees for the final prediction. We apply this framework to movie recommendation and consider two emotion spaces extracted from the movie plot summary and the reviews, and three semantic spaces: actor, director, and genre. Experiments with the 100K and 1M MovieLens datasets show that including emotion and semantic information significantly improves the accuracy of prediction and improves upon the state-of-the-art CF techniques. We also analyse the importance of each feature space and describe some uncovered latent groups.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132490791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 109
A tool for comparative IR evaluation on component level 一个组件级比较IR评价工具
Thomas Wilhelm, Jens Kürsten, Maximilian Eibl
{"title":"A tool for comparative IR evaluation on component level","authors":"Thomas Wilhelm, Jens Kürsten, Maximilian Eibl","doi":"10.1145/2009916.2010165","DOIUrl":"https://doi.org/10.1145/2009916.2010165","url":null,"abstract":"1. MOTIVATION Experimental information retrieval (IR) evaluation is an important instrument to measure the effectiveness of novel methods. Although IR system complexity has grown over years, the general framework for evaluation remained unchanged since its first implementation in the 1960s. Test collections were growing from thousands to millions of documents. Regular reuse resulted in larger topic sets for evaluation. New business models for information access required novel interpretations of effectiveness measures. Nevertheless, most experimental evaluations still rely on an over 50 year old paradigm. Participants of a SIGIR workshop in 2009 [1] discussed the implementation of new methodological standards for evaluation. But at the same time they worried about practicable ways to implement them. A review about recent publications containing experimental evaluations supports this concern [2]. The study also presented a web-based platform for longitudinal evaluation. In a similar way, data from the past decade of CLEF evaluations have been released through the DIRECT system. While the operators of the latter system reported about 50 new users since the release of the data [3], no further contributions were recorded on the web-platform introduced in [2]. In our point of view archiving evaluation data for longitudinal analysis is a first important step. A next step is to develop a methodology that supports researchers in choosing appropriate baselines for comparison. This can be achieved by reporting evaluation results on component level [4] rather than on system level. An exemplary study was presented in [2], where the Indri system was tested with several components switched on or off. Following this idea, an approach to assess novel methods could be to compare to related components only. This would require the community to formally record details of system configurations in connection with experimental results. We suppose that transparent descriptions of system components used in experiments could help researchers in choosing appropriate baselines.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114547451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Bagging gradient-boosted trees for high precision, low variance ranking models 套袋梯度增强树用于高精度、低方差排序模型
Y. Ganjisaffar, R. Caruana, C. Lopes
{"title":"Bagging gradient-boosted trees for high precision, low variance ranking models","authors":"Y. Ganjisaffar, R. Caruana, C. Lopes","doi":"10.1145/2009916.2009932","DOIUrl":"https://doi.org/10.1145/2009916.2009932","url":null,"abstract":"Recent studies have shown that boosting provides excellent predictive performance across a wide variety of tasks. In Learning-to-rank, boosted models such as RankBoost and LambdaMART have been shown to be among the best performing learning methods based on evaluations on public data sets. In this paper, we show how the combination of bagging as a variance reduction technique and boosting as a bias reduction technique can result in very high precision and low variance ranking models. We perform thousands of parameter tuning experiments for LambdaMART to achieve a high precision boosting model. Then we show that a bagged ensemble of such LambdaMART boosted models results in higher accuracy ranking models while also reducing variance as much as 50%. We report our results on three public learning-to-rank data sets using four metrics. Bagged LamdbaMART outperforms all previously reported results on ten of the twelve comparisons, and bagged LambdaMART outperforms non-bagged LambdaMART on all twelve comparisons. For example, wrapping bagging around LambdaMART increases NDCG@1 from 0.4137 to 0.4200 on the MQ2007 data set; the best prior results in the literature for this data set is 0.4134 by RankBoost.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116768591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 197
Formulating effective questions for community-based question answering 为社区问答制定有效的问题
Saori Suzuki, Shin-ichi Nakayama, Hideo Joho
{"title":"Formulating effective questions for community-based question answering","authors":"Saori Suzuki, Shin-ichi Nakayama, Hideo Joho","doi":"10.1145/2009916.2010149","DOIUrl":"https://doi.org/10.1145/2009916.2010149","url":null,"abstract":"Community-based Question Answering (CQA) services have become a major venue for people's information seeking on the Web. However, many studies on CQA have focused on the prediction of the best answers for a given question. This paper looks into the formulation of effective questions in the context of CQA. In particular, we looked at effect of contextual factors appended to a basic question on the performance of submitted answers. This study analysed a total of 930 answers returned in response to 266 questions that were formulated by 46 participants. The results show that adding a questionnaire's personal and social attribute to the question helped improve the perceptions of answers both in information seeking questions and opinion seeking questions.","PeriodicalId":356580,"journal":{"name":"Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114734477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 10
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信