Liangda Li, Hongbo Deng, Anlei Dong, Yi Chang, H. Zha, R. Baeza-Yates
{"title":"Analyzing User's Sequential Behavior in Query Auto-Completion via Markov Processes","authors":"Liangda Li, Hongbo Deng, Anlei Dong, Yi Chang, H. Zha, R. Baeza-Yates","doi":"10.1145/2766462.2767723","DOIUrl":"https://doi.org/10.1145/2766462.2767723","url":null,"abstract":"Query auto-completion (QAC) plays an important role in helping users type less when submitting a query. The QAC engine generally offers a list of suggested queries that start with the user's input as a prefix, and the list is updated to match the input after each keystroke. Therefore, rich user interactions can be observed at each keystroke until the user clicks a suggestion or types the entire query manually. It is becoming increasingly important to analyze and understand users' interactions with the QAC engine in order to improve its performance. Existing work on QAC either ignores users' interaction data or assumes that the interactions at each keystroke are independent of the others. Our paper focuses on users' sequential interactions with a QAC engine within and across QAC sessions, rather than on users' interactions at each keystroke of each QAC session separately. Analyzing the dependencies in users' sequential interactions improves our understanding of the following three questions: 1) how is a user's skipping/viewing move at the current keystroke influenced by that at the previous keystroke? 2) how can search engines' query suggestions at short keystrokes be improved based on those at later, longer keystrokes? and 3) facing a targeted query shown in the suggestion list, why does a user decide to continue typing rather than click the intended suggestion? We propose a probabilistic model that addresses these three questions in a unified way, and illustrate how the model determines users' final click decisions. Compared with state-of-the-art methods, our proposed model suggests queries that better satisfy users' intents.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131254189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Johanne R. Trippas, Damiano Spina, M. Sanderson, L. Cavedon
{"title":"Towards Understanding the Impact of Length in Web Search Result Summaries over a Speech-only Communication Channel","authors":"Johanne R. Trippas, Damiano Spina, M. Sanderson, L. Cavedon","doi":"10.1145/2766462.2767826","DOIUrl":"https://doi.org/10.1145/2766462.2767826","url":null,"abstract":"Presenting search results over a speech-only communication channel involves a number of challenges for users due to cognitive limitations and the serial nature of speech. We investigated the impact of search result summary length in speech-based web search, and compared our results to a text baseline. Based on crowdsourced workers, we found that users preferred longer, more informative summaries for text presentation. For audio, user preferences depended on the style of query. For single-facet queries, shortened audio summaries were preferred; additionally, users were found to judge relevance with accuracy similar to that for text-based summaries. For multi-facet queries, user preferences were not as clear, suggesting that more sophisticated techniques are required to handle such queries.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133317139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Multi-query Retrieval Tasks Using Density Matrix Transformation","authors":"Qiuchi Li, Jingfei Li, Peng Zhang, D. Song","doi":"10.1145/2766462.2767819","DOIUrl":"https://doi.org/10.1145/2766462.2767819","url":null,"abstract":"The quantum probabilistic framework has recently been applied to Information Retrieval (IR). A representative example is the Quantum Language Model (QLM), which was developed for ad-hoc retrieval with single queries and has achieved significant improvements over traditional language models. In QLM, a density matrix, defined on the quantum probabilistic space, is estimated as a representation of the user's search intention with respect to a specific query. However, QLM is unable to capture the dynamics of the user's information need across the query history. This limitation restricts its application to dynamic search tasks, e.g., session search. In this paper, we propose a Session-based Quantum Language Model (SQLM) that handles the multi-query session search task. In SQLM, a transformation model of density matrices is proposed to model the evolution of the user's information need in response to the user's interaction with the search engine, by incorporating features extracted from both positive feedback (clicked documents) and negative feedback (skipped documents). Extensive experiments conducted on TREC 2013 and 2014 session track data demonstrate the effectiveness of SQLM in comparison with the classic QLM.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114316445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Time Pressure in Information Search","authors":"Anita Crescenzi","doi":"10.1145/2766462.2767851","DOIUrl":"https://doi.org/10.1145/2766462.2767851","url":null,"abstract":"The primary purpose of this research is to explore the impact of perceived time pressure on search behaviors, searcher perceptions of the search system and the search experience. Are there observable behavioral changes when a searcher is time-pressured? To what extent are search behavior differences attributable to objective experimental manipulation versus to the subjective experience of time pressure? An important secondary purpose of this work is to identify appropriate outcome measures that allow for the comparison of session-level search behaviors when time is manipulated.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114307612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relevance Scores for Triples from Type-Like Relations","authors":"H. Bast, Björn Buchhold, Elmar Haussmann","doi":"10.1145/2766462.2767734","DOIUrl":"https://doi.org/10.1145/2766462.2767734","url":null,"abstract":"We compute and evaluate relevance scores for knowledge-base triples from type-like relations. Such a score measures the degree to which an entity \"belongs\" to a type. For example, Quentin Tarantino has various professions, including Film Director, Screenwriter, and Actor. The first two would get a high score in our setting, because those are his main professions. The third would get a low score, because he mostly had cameo appearances in his own movies. Such scores are essential in the ranking for entity queries, e.g. \"American actors\" or \"Quentin Tarantino professions\". These scores are different from scores for \"correctness\" or \"accuracy\" (all three professions above are correct and accurate). We propose a variety of algorithms to compute these scores. For our evaluation we designed a new benchmark, which includes a ground truth based on about 14K human judgments obtained via crowdsourcing. Inter-judge agreement is slightly over 90%. Existing approaches from the literature give results far from the optimum. Our best algorithms achieve an agreement of about 80% with the ground truth.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116310516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalizing Search on Shared Devices","authors":"Ryen W. White, Ahmed Hassan Awadallah","doi":"10.1145/2766462.2767736","DOIUrl":"https://doi.org/10.1145/2766462.2767736","url":null,"abstract":"Search personalization tailors the search experience to individual searchers. To do this, search engines construct interest models comprising signals from observed behavior associated with machines, often via Web browser cookies or other user identifiers. However, shared device usage is common, meaning that the activities of multiple searchers may be interwoven in the interest models generated. Recent research on activity attribution has led to methods to automatically disentangle the histories of multiple searchers and correctly ascribe newly-observed search activity to the correct person. Building on this, we introduce attribution-based personalization (ABP), a procedure that extends traditional personalization to target individual searchers on shared devices. Activity attribution may improve personalization, but its benefits are not yet fully understood. We present an oracle study (with perfect knowledge of which searchers perform each action on each machine) to understand the effectiveness of ABP in predicting searchers' future interests. We utilize a large Web search log dataset containing both person identifiers and machine identifiers to quantify the gain in personalization performance from ABP, identify the circumstances under which ABP is most effective, and develop a classifier to determine when to apply it that yields sizable gains in personalization performance. ABP allows search providers to personalize experiences for individuals rather than targeting all users of a device collectively.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116415460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jonathan J. Dorando, Konstantine Arkoudas, P. Vasa, Gary Kazantsev, Gideon Mann
{"title":"Finding Money in the Haystack: Information Retrieval at Bloomberg","authors":"Jonathan J. Dorando, Konstantine Arkoudas, P. Vasa, Gary Kazantsev, Gideon Mann","doi":"10.1145/2766462.2776782","DOIUrl":"https://doi.org/10.1145/2766462.2776782","url":null,"abstract":"The financial markets are a rich domain for search, and it is not simple to serve the entire scope of financial professionals, who make their living on accurate, timely, and deep information. The data sources are many and disparate. They include domains with rich structured data such as company and security attributes, textual data like research reports, and time-sensitive news stories. Not only is the domain complicated, but some of the techniques that work for web search have to be adapted and reconsidered in an enterprise context with fewer eyeballs but equally complicated questions. At Bloomberg, we have been addressing these problems over the past four years in the search and discoverability group, heavily leveraging insights from the academic and open-source communities and applying them to our problems. We'll discuss our efforts in Natural Language Question & Answer (NLQA), learning to rank, federated search, and crowdsourcing, and how this all comes together to make search effective for our users.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123647694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yiqun Liu, Ye Chen, Jinhui Tang, Jiashen Sun, Min Zhang, Shaoping Ma, Xuan Zhu
{"title":"Different Users, Different Opinions: Predicting Search Satisfaction with Mouse Movement Information","authors":"Yiqun Liu, Ye Chen, Jinhui Tang, Jiashen Sun, Min Zhang, Shaoping Ma, Xuan Zhu","doi":"10.1145/2766462.2767721","DOIUrl":"https://doi.org/10.1145/2766462.2767721","url":null,"abstract":"Satisfaction prediction is one of the prime concerns in search performance evaluation. It is a non-trivial task for two major reasons: (1) The definition of satisfaction is rather subjective, and different users may have different opinions in satisfaction judgement. (2) Most existing studies on satisfaction prediction rely mainly on users' click-through or query reformulation behaviors, but there are many sessions without such interactions. To shed light on these research questions, we construct an experimental search engine that can collect users' satisfaction feedback as well as mouse click-through/movement data. Unlike existing studies, we compare for the first time search users' and external assessors' opinions on satisfaction. We find that search users pay more attention to the utility of results, while external assessors emphasize the effort spent in search sessions. Inspired by recent studies in predicting result relevance based on mouse movement patterns (namely motifs), we propose to estimate the utilities of search results and the efforts in search sessions with motifs extracted from mouse movement data on search engine result pages (SERPs). Besides the existing frequency-based motif selection method, two novel selection strategies (distance-based and distribution-based) are also adopted to extract high-quality motifs for satisfaction prediction. Experimental results on over 1,000 user sessions show that the proposed strategies outperform existing methods and also have promising generalization capability for different users and queries.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125901368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reachability based Ranking in Interactive Image Retrieval","authors":"Jiyi Li","doi":"10.1145/2766462.2767777","DOIUrl":"https://doi.org/10.1145/2766462.2767777","url":null,"abstract":"In some interactive image retrieval systems, users can select images from image search results and click to view similar or related images until they reach their targets. Existing image ranking options are based on relevance, update time, interestingness and so on. Because of the inexact description of user targets or the unsatisfying performance of image retrieval methods, users may not reach their targets in a single round of interaction. When we consider multi-round interactions, helping users select the images from which the targets can be reached in fewer rounds becomes a useful problem. In this paper, we propose a new kind of ranking option that ranks images according to the difficulty of reaching potential targets from them. We model the interactive image search behavior as navigation on an information network constructed from an image collection and an image retrieval method. We use the properties of this information network for reachability based ranking. Experiments based on a social image collection show the efficiency of our approach.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"54 62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124705287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Term Location Information to Enhance Probabilistic Information Retrieval","authors":"Baiyan Liu, X. An, Xiangji Huang","doi":"10.1145/2766462.2767827","DOIUrl":"https://doi.org/10.1145/2766462.2767827","url":null,"abstract":"Nouns are more important than other parts of speech in information retrieval and are more often found near the beginning or the end of sentences. In this paper, we investigate the effects on information retrieval of rewarding terms based on their location in sentences. In particular, we propose a novel Term Location (TEL) retrieval model based on BM25 to enhance probabilistic information retrieval, where a kernel-based method is used to capture term placement patterns. Experiments on five TREC datasets of varied size and content indicate that the proposed model significantly outperforms the optimized BM25 and DirichletLM in MAP over all datasets with all kernel functions, and outperforms the optimized BM25 and DirichletLM over most of the datasets in P@5 and P@20 with different kernel functions.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128730192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}