Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval最新文献

筛选
英文 中文
Incorporating Non-sequential Behavior into Click Models 将非顺序行为合并到点击模型中
Chao Wang, Yiqun Liu, Meng Wang, K. Zhou, Jian-Yun Nie, Shaoping Ma
{"title":"Incorporating Non-sequential Behavior into Click Models","authors":"Chao Wang, Yiqun Liu, Meng Wang, K. Zhou, Jian-Yun Nie, Shaoping Ma","doi":"10.1145/2766462.2767712","DOIUrl":"https://doi.org/10.1145/2766462.2767712","url":null,"abstract":"Click-through information is considered as a valuable source of users' implicit relevance feedback. As user behavior is usually influenced by a number of factors such as position, presentation style and site reputation, researchers have proposed a variety of assumptions (i.e.~click models) to generate a reasonable estimation of result relevance. The construction of click models usually follow some hypotheses. For example, most existing click models follow the sequential examination hypothesis in which users examine results from top to bottom in a linear fashion. While these click models have been successful, many recent studies showed that there is a large proportion of non-sequential browsing (both examination and click) behaviors in Web search, which the previous models fail to cope with. In this paper, we investigate the problem of properly incorporating non-sequential behavior into click models. We firstly carry out a laboratory eye-tracking study to analyze user's non-sequential examination behavior and then propose a novel click model named Partially Sequential Click Model (PSCM) that captures the practical behavior of users. We compare PSCM with a number of existing click models using two real-world search engine logs. Experimental results show that PSCM outperforms other click models in terms of both predicting click behavior (perplexity) and estimating result relevance (NDCG and user preference test). We also publicize the implementations of PSCM and related datasets for possible future comparison studies.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114109046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 57
About the 'Compromised Information Need' and Optimal Interaction as Quality Measure for Search Interfaces 关于“妥协的信息需求”和最优交互作为搜索界面的质量度量
E. Hoenkamp
{"title":"About the 'Compromised Information Need' and Optimal Interaction as Quality Measure for Search Interfaces","authors":"E. Hoenkamp","doi":"10.1145/2766462.2767800","DOIUrl":"https://doi.org/10.1145/2766462.2767800","url":null,"abstract":"Taylor's concept of levels of information need has been cited in over a hundred IR publications since his work was first published. It concerns the phases a searcher goes through, starting with the feeling that information seems missing, to expressing a query to the system that hopefully will provide that information. As every year more IR publications reference Taylor's work, but none of these so much as attempt to formalize the concept they use, it is doubtful that the term is always used with the same connotation. Hence we propose a formal definition of levels of information need, as especially in IR with its formal underpinnings, there is no excuse to leave frequently used terms undefined. We cast Taylor's informally defined levels of information need --- and the transitions between them --- as an evolving dynamical system subsuming two subsystems: the searcher and the search engine. This moves the focus from optimizing the search engine to optimizing the search interface. We define the quality of an interface by how much users need to compromise in order to fill their information need. We show how a theoretical optimum can be calculated that assumes the least compromise from the user. This optimum can be used to establish a base-line for measuring how much a search interface deviates from the ideal, given actual search behavior, and by the same token offers a measure of comparison among competing interfaces.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114899184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
A Probabilistic Model for Information Retrieval Based on Maximum Value Distribution 基于最大值分布的信息检索概率模型
Jiaul H. Paik
{"title":"A Probabilistic Model for Information Retrieval Based on Maximum Value Distribution","authors":"Jiaul H. Paik","doi":"10.1145/2766462.2767762","DOIUrl":"https://doi.org/10.1145/2766462.2767762","url":null,"abstract":"The main goal of a retrieval model is to measure the degree of relevance of a document with respect to the given query. Probabilistic models are widely used to measure the likelihood of relevance of a document by combining within document term frequency and term specificity in a formal way. Recent research shows that tf normalization that factors in multiple aspects of term salience is an effective scheme. However, existing models do not fully utilize these tf normalization components in a principled way. Moreover, most state of the art models ignore the distribution of a term in the part of the collection that contains the term. In this article, we introduce a new probabilistic model of ranking that addresses the above issues. We argue that, since the relevance of a document increases with the frequency of the query term, this assumption can be used to measure the likelihood that the normalized frequency of a term in a particular document will be maximum with respect to its distribution in the elite set. Thus, the weight of a term in a document is proportional to the probability that the normalized frequency of that term is maximum under the hypothesis that the frequencies are generated randomly. To that end, we introduce a ranking function based on maximum value distribution that uses two aspects of tf normalization. The merit of the proposed model is demonstrated on a number of recent large web collections. Results show that the proposed model outperforms the state of the art models by significantly large margin.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127782608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
A Priori Relevance Based On Quality and Diversity Of Social Signals 基于社会信号质量和多样性的先验关联
Ismail Badache, M. Boughanem
{"title":"A Priori Relevance Based On Quality and Diversity Of Social Signals","authors":"Ismail Badache, M. Boughanem","doi":"10.1145/2766462.2767807","DOIUrl":"https://doi.org/10.1145/2766462.2767807","url":null,"abstract":"Social signals (users' actions) associated with web resources (documents) can be considered as an additional information that can play a role to estimate a priori importance of the resource. In this paper, we are particularly interested in: first, showing the impact of signals diversity associated to a resource on information retrieval performance; second, studying the influence of their social networks origin on their quality. We propose to model these social features as prior that we integrate into language model. We evaluated the effectiveness of our approach on IMDb dataset containing 167438 resources and their social signals collected from several social networks. Our experimental results are statistically significant and show the interest of integrating signals diversity in the retrieval process.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126495416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
Rank-GeoFM: A Ranking based Geographical Factorization Method for Point of Interest Recommendation Rank-GeoFM:一种基于地理因子排序的兴趣点推荐方法
Xutao Li, G. Cong, Xiaoli Li, T. Pham, S. Krishnaswamy
{"title":"Rank-GeoFM: A Ranking based Geographical Factorization Method for Point of Interest Recommendation","authors":"Xutao Li, G. Cong, Xiaoli Li, T. Pham, S. Krishnaswamy","doi":"10.1145/2766462.2767722","DOIUrl":"https://doi.org/10.1145/2766462.2767722","url":null,"abstract":"With the rapid growth of location-based social networks, Point of Interest (POI) recommendation has become an important research problem. However, the scarcity of the check-in data, a type of implicit feedback data, poses a severe challenge for existing POI recommendation methods. Moreover, different types of context information about POIs are available and how to leverage them becomes another challenge. In this paper, we propose a ranking based geographical factorization method, called Rank-GeoFM, for POI recommendation, which addresses the two challenges. In the proposed model, we consider that the check-in frequency characterizes users' visiting preference and learn the factorization by ranking the POIs correctly. In our model, POIs both with and without check-ins will contribute to learning the ranking and thus the data sparsity problem can be alleviated. In addition, our model can easily incorporate different types of context information, such as the geographical influence and temporal influence. We propose a stochastic gradient descent based algorithm to learn the factorization. Experiments on publicly available datasets under both user-POI setting and user-time-POI setting have been conducted to test the effectiveness of the proposed method. Experimental results under both settings show that the proposed method outperforms the state-of-the-art methods significantly in terms of recommendation accuracy.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122660760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 325
Opinion Spammer Detection in Web Forum 意见垃圾邮件检测在Web论坛
Yu-Ren Chen, Hsin-Hsi Chen
{"title":"Opinion Spammer Detection in Web Forum","authors":"Yu-Ren Chen, Hsin-Hsi Chen","doi":"10.1145/2766462.2767766","DOIUrl":"https://doi.org/10.1145/2766462.2767766","url":null,"abstract":"In this paper, a real case study on opinion spammer detection in web forum is presented. We explore user profiles, maximum spamicity of first posts of users, burstiness of registration of user accounts, and frequent poster set to build a model with SVM with RBF kernel and frequent itemset mining. The proposed model achieves 0.6753 precision, 0.6190 recall, and 0.6460 F1 score. The result is promising because the ratio of opinion spammers in the test set is only 0.98%.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121510840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 21
Evaluating Streams of Evolving News Events 评估不断发展的新闻事件流
G. Baruah, Mark D. Smucker, C. Clarke
{"title":"Evaluating Streams of Evolving News Events","authors":"G. Baruah, Mark D. Smucker, C. Clarke","doi":"10.1145/2766462.2767751","DOIUrl":"https://doi.org/10.1145/2766462.2767751","url":null,"abstract":"People track news events according to their interests and available time. For a major event of great personal interest, they might check for updates several times an hour, taking time to keep abreast of all aspects of the evolving event. For minor events of more marginal interest, they might check back once or twice a day for a few minutes to learn about the most significant developments. Systems generating streams of updates about evolving events can improve user performance by appropriately filtering these updates, making it easy for users to track events in a timely manner without undue information overload. Unfortunately, predicting user performance on these systems poses a significant challenge. Standard evaluation methodology, designed for Web search and other adhoc retrieval tasks, adapts poorly to this context. In this paper, we develop a simple model that simulates users checking the system from time to time to read updates. For each simulated user, we generate a trace of their activities alternating between away times and reading times. These traces are then applied to measure system effectiveness. We test our model using data from the TREC 2013 Temporal Summarization Track (TST) comparing it to the effectiveness measures used in that track. The primary TST measure corresponds most closely with a modeled user that checks back once a day on average for an average of one minute. Users checking more frequently for longer times may view the relative performance of participating systems quite differently. In light of this sensitivity to user behavior, we recommend that future experiments be built around clearly stated assumptions regarding user interfaces and access patterns, with effectiveness measures reflecting these assumptions.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131545003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 10
Unconscious Physiological Effects of Search Latency on Users and Their Click Behaviour 搜索延迟对用户及其点击行为的无意识生理影响
Miguel Barreda-Ángeles, I. Arapakis, Xiao Bai, B. B. Cambazoglu, Alexandre Pereda-Baños
{"title":"Unconscious Physiological Effects of Search Latency on Users and Their Click Behaviour","authors":"Miguel Barreda-Ángeles, I. Arapakis, Xiao Bai, B. B. Cambazoglu, Alexandre Pereda-Baños","doi":"10.1145/2766462.2767719","DOIUrl":"https://doi.org/10.1145/2766462.2767719","url":null,"abstract":"Understanding the impact of a search system's response latency on its users' searching behaviour has been recently an active research topic in the information retrieval and human-computer interaction areas. Along the same line, this paper focuses on the user impact of search latency and makes the following two contributions. First, through a controlled experiment, we reveal the physiological effects of response latency on users and show that these effects are present even at small increases in response latency. We compare these effects with the information gathered from self-reports and show that they capture the nuanced attentional and emotional reactions to latency much better. Second, we carry out a large-scale analysis using a web search query log obtained from Yahoo to understand the change in the way users engage with a web search engine under varying levels of increasing response latency. In particular, we analyse the change in the click behaviour of users when they are subject to increasing response latency and reveal significant behavioural differences.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128378034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 28
Effective Latent Models for Binary Feedback in Recommender Systems 推荐系统中二元反馈的有效潜在模型
M. Volkovs, Guangwei Yu
{"title":"Effective Latent Models for Binary Feedback in Recommender Systems","authors":"M. Volkovs, Guangwei Yu","doi":"10.1145/2766462.2767716","DOIUrl":"https://doi.org/10.1145/2766462.2767716","url":null,"abstract":"In many collaborative filtering (CF) applications, latent approaches are the preferred model choice due to their ability to generate real-time recommendations efficiently. However, the majority of existing latent models are not designed for implicit binary feedback (views, clicks, plays etc.) and perform poorly on data of this type. Developing accurate models from implicit feedback is becoming increasingly important in CF since implicit feedback can often be collected at lower cost and in much larger quantities than explicit preferences. The need for accurate latent models for implicit data was further emphasized by the recently conducted Million Song Dataset Challenge organized by Kaggle. In this challenge, the results for the best latent model were orders of magnitude worse than neighbor-based approaches, and all the top performing teams exclusively used neighbor-based models. We address this problem and propose a new latent approach for binary feedback in CF. In our model, neighborhood similarity information is used to guide latent factorization and derive accurate latent representations. We show that even with simple factorization methods like SVD, our approach outperforms existing models and produces state-of-the-art results.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124653593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 67
On the Relation Between Assessor's Agreement and Accuracy in Gamified Relevance Assessment 游戏化关联评价中评价者一致性与准确性的关系
Olga Megorskaya, V. Kukushkin, P. Serdyukov
{"title":"On the Relation Between Assessor's Agreement and Accuracy in Gamified Relevance Assessment","authors":"Olga Megorskaya, V. Kukushkin, P. Serdyukov","doi":"10.1145/2766462.2767727","DOIUrl":"https://doi.org/10.1145/2766462.2767727","url":null,"abstract":"Expert judgments (labels) are widely used in Information Retrieval for the purposes of search quality evaluation and machine learning. Setting up the process of collecting such judgments is a challenge of its own, and the maintenance of judgments quality is an extremely important part of the process. One of the possible ways of controlling the quality is monitoring inter-assessor agreement level. But does the agreement level really reflect the quality of assessor's judgments? Indeed, if a group of assessors comes to a consensus, to what extent should we trust their collective opinion? In this paper, we investigate, whether the agreement level can be used as a metric for estimating the quality of assessor's judgments, and provide recommendations for the design of judgments collection workflow. Namely, we estimate the correlation between assessors' accuracy and agreement in the scope of several workflow designs and investigate which specific workflow features influence the accuracy of judgments the most.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129957980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信