Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval: Latest Papers

Optimizing Hyper-Phrase Queries
Dhruv Gupta, K. Berberich
{"title":"Optimizing Hyper-Phrase Queries","authors":"Dhruv Gupta, K. Berberich","doi":"10.1145/3409256.3409827","DOIUrl":"https://doi.org/10.1145/3409256.3409827","url":null,"abstract":"A hyper-phrase query (HPQ) consists of a sequence of phrase sets. Such queries naturally arise when attempting to spot knowledge graph (KG) facts or sets of KG facts in large document collections to establish their provenance. Our approach addresses this challenge by proposing query operators to detect text regions in documents that correspond to the HPQ as combinations of n-grams and skip-grams. The optimization lies in identifying the most cost-efficient order of query operators that can be executed to identify the text regions containing the HPQ. We show the efficiency of our optimizations on spotting facts from Wikidata in document collections amounting to more than thirty million documents.","PeriodicalId":430907,"journal":{"name":"Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130359195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Sentiment Prediction using Attention on User-Specific Rating Distribution
Ting Lin, Aixin Sun
{"title":"Sentiment Prediction using Attention on User-Specific Rating Distribution","authors":"Ting Lin, Aixin Sun","doi":"10.1145/3409256.3409826","DOIUrl":"https://doi.org/10.1145/3409256.3409826","url":null,"abstract":"For document-level sentiment prediction, many methods try to first capture opinion words and then infer sentiments based on these words. We observe that different users may use the same words to express different levels of satisfaction, e.g., 'great' may indicate high satisfaction for some users, or simply a general description for others. Intuitively, we expect the choice of a sentiment expression follows a distribution specific to a user and her sentiment to a product. In this paper, we propose a hierarchical neural network model with user-specific rating distribution attention (H-URA) to learn document representation for sentiment prediction. Our model learns local sentiment distributions from a user's expression, at word-level and at sentence-level respectively. We also learn a global sentiment distribution by using both user and product information. The attention weight is then computed from the local and global sentiment distributions. Experimental results show the superiority of our H-URA model compared to strong baselines on benchmark datasets.","PeriodicalId":430907,"journal":{"name":"Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116509579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Unbiased Pairwise Learning from Biased Implicit Feedback
Yuta Saito
{"title":"Unbiased Pairwise Learning from Biased Implicit Feedback","authors":"Yuta Saito","doi":"10.1145/3409256.3409812","DOIUrl":"https://doi.org/10.1145/3409256.3409812","url":null,"abstract":"Implicit feedback is prevalent in real-world scenarios and is widely used in the construction of recommender systems. However, the application of implicit feedback data is much more complicated than its explicit counterpart because it provides only positive feedback, and we cannot know whether the non-interacted feedback is positive or negative. Furthermore, positive feedback for rare items is observed less frequently than for popular items. The relevance of such rare items is often underestimated. Existing solutions to such challenges are subject to bias toward the ideal loss function of interest or accept a simple pointwise approach, which is inappropriate for a ranking task. In this study, we first define an ideal pairwise loss function using the ground-truth relevance parameters that should be used to optimize the ranking metrics. Subsequently, we propose a theoretically grounded unbiased estimator for this ideal pairwise loss and a corresponding algorithm, Unbiased Bayesian Personalized Ranking. A pairwise algorithm addressing the two major difficulties in using implicit feedback has yet to be investigated, and the proposed algorithm is the first pairwise method for solving these challenges in a theoretically principled manner. Through theoretical analysis, we provide the critical statistical properties of the proposed unbiased estimator and a practical variance reduction technique. Empirical evaluations using real-world datasets demonstrate the practical strength of our approach.","PeriodicalId":430907,"journal":{"name":"Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122843908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 31
A Mixed-Method Analysis of Text and Audio Search Interfaces with Varying Task Complexity
Alexandra Vtyurina, C. Clarke, E. Law, Johanne R. Trippas, Horatiu Bota
{"title":"A Mixed-Method Analysis of Text and Audio Search Interfaces with Varying Task Complexity","authors":"Alexandra Vtyurina, C. Clarke, E. Law, Johanne R. Trippas, Horatiu Bota","doi":"10.1145/3409256.3409822","DOIUrl":"https://doi.org/10.1145/3409256.3409822","url":null,"abstract":"Voice-based assistants have become a popular tool for conducting web search, particularly for factoid question answering. However, for more complex web searches, their functionality remains limited, as does our understanding of the ways in which users can best interact with audio-based search results. In this paper, we compare and contrast user behaviour through the representation of search results over two mediums: text and audio. We begin by conducting a crowdsourced study exposing the differences in user selection of search results when those are presented in text and audio formats. We further confirm these differences and investigate the reasons behind them through a mixed-methods laboratory study. Through a qualitative analysis of the collected data, we produce a list of guidelines for an audio-based presentation of search results.","PeriodicalId":430907,"journal":{"name":"Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","volume":"91 31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128804121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
The Effects of Learning Objectives on Searchers' Perceptions and Behaviors
Kelsey Urgo, Jaime Arguello, Robert G. Capra
{"title":"The Effects of Learning Objectives on Searchers' Perceptions and Behaviors","authors":"Kelsey Urgo, Jaime Arguello, Robert G. Capra","doi":"10.1145/3409256.3409815","DOIUrl":"https://doi.org/10.1145/3409256.3409815","url":null,"abstract":"In recent years, the \"search as learning\" community has argued that search systems should be designed to support learning. We report on a lab study in which we manipulated the learning objectives associated with search tasks assigned to participants. We manipulated learning objectives by leveraging Anderson and Krathwohl's taxonomy of learning (A&K's taxonomy)[2], which situates learning objectives at the intersection of two orthogonal dimensions: the cognitive process and the knowledge type dimension. Participants in our study completed tasks with learning objectives that varied across three cognitive processes (apply, evaluate, and create) and three knowledge types (factual, conceptual, and procedural knowledge). We focus on the effects of the task's cognitive process and knowledge type on participants' pre-/post-task perceptions and search behaviors. Our results found that the three knowledge types considered in our study had a greater effect than the three cognitive processes. Specifically, conceptual knowledge tasks were perceived to be more difficult and required more search activity. We discuss implications for designing search systems that support learning.","PeriodicalId":430907,"journal":{"name":"Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126499389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
Question Answering over Curated and Open Web Sources
Rishiraj Saha Roy, Avishek Anand
{"title":"Question Answering over Curated and Open Web Sources","authors":"Rishiraj Saha Roy, Avishek Anand","doi":"10.1145/3409256.3409809","DOIUrl":"https://doi.org/10.1145/3409256.3409809","url":null,"abstract":"The last few years have seen an explosion of research on the topic of automated question answering (QA), spanning the communities of information retrieval, natural language processing, and artificial intelligence. This tutorial would cover the highlights of this really active period of growth for QA to give the audience a grasp over the families of algorithms that are currently being used. We partition research contributions by the underlying source from where answers are retrieved: curated knowledge graphs, unstructured text, or hybrid corpora. We choose this dimension of partitioning as it is the most discriminative when it comes to algorithm design. Other key dimensions are covered within each sub-topic: like the complexity of questions addressed, and degrees of explainability and interactivity introduced in the systems. We would conclude the tutorial with the most promising emerging trends in the expanse of QA, that would help new entrants into this field make the best decisions to take the community forward. This tutorial was recently presented at SIGIR 2020.","PeriodicalId":430907,"journal":{"name":"Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125599163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Interactive Evaluation of Conversational Agents: Reflections on the Impact of Search Task Design
Mateusz Dubiel, Martin Halvey, L. Azzopardi, Sylvain Daronnat
{"title":"Interactive Evaluation of Conversational Agents: Reflections on the Impact of Search Task Design","authors":"Mateusz Dubiel, Martin Halvey, L. Azzopardi, Sylvain Daronnat","doi":"10.1145/3409256.3409814","DOIUrl":"https://doi.org/10.1145/3409256.3409814","url":null,"abstract":"Undertaking an interactive evaluation of goal-oriented conversational agents (CAs) is challenging: it requires the search task to be realistic and relatable while accounting for the users' cognitive limitations. In the current paper we discuss findings of two Wizard of Oz studies and provide our reflections regarding the impact of different interactive search task designs on participants' performance, satisfaction and cognitive workload. In the first study, we tasked participants with finding the cheapest flight that met a certain departure time. In the second study we added an additional criterion: \"travel time\" and asked participants to find a flight option that offered a good trade-off between price and travel time. We found that using search tasks where participants need to decide between several competing search criteria (price vs. time) led to a higher search involvement and lower variance in usability and cognitive workload ratings between different CAs. We hope that our results will provoke discussion on how to make the evaluation of voice-only goal-oriented CAs more reliable and ecologically valid.","PeriodicalId":430907,"journal":{"name":"Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","volume":"IA-20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126562138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Sanitizing Synthetic Training Data Generation for Question Answering over Knowledge Graphs
Trond Linjordet, K. Balog
{"title":"Sanitizing Synthetic Training Data Generation for Question Answering over Knowledge Graphs","authors":"Trond Linjordet, K. Balog","doi":"10.1145/3409256.3409836","DOIUrl":"https://doi.org/10.1145/3409256.3409836","url":null,"abstract":"Synthetic data generation is important to training and evaluating neural models for question answering over knowledge graphs. The quality of the data and the partitioning of the datasets into training, validation and test splits impact the performance of the models trained on this data. If the synthetic data generation depends on templates, as is the predominant approach for this task, there may be a leakage of information via a shared basis of templates across data splits if the partitioning is not performed hygienically. This paper investigates the extent of such information leakage across data splits, and the ability of trained models to generalize to test data when the leakage is controlled. We find that information leakage indeed occurs and that it affects performance. At the same time, the trained models do generalize to test data under the sanitized partitioning presented here. Importantly, these findings extend beyond the particular flavor of question answering task we studied and raise a series of difficult questions around template-based synthetic data generation that will necessitate additional research.","PeriodicalId":430907,"journal":{"name":"Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124501573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
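The abstract above warns that templates shared across data splits can leak information. The following is a minimal sketch of one way to avoid that, by assigning whole templates (rather than individual examples) to a split. It is not the authors' procedure; the `template_id` field, the `split_by_template` function, and the 80/10/10 ratios are assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_template(examples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Partition examples so that no template appears in more than one split.

    `examples` is a list of dicts carrying a hypothetical "template_id" key.
    """
    # Group examples by the template that generated them.
    by_template = defaultdict(list)
    for example in examples:
        by_template[example["template_id"]].append(example)

    # Shuffle template ids deterministically, then cut the id list into three parts.
    template_ids = sorted(by_template)
    random.Random(seed).shuffle(template_ids)
    n_train = int(len(template_ids) * ratios[0])
    n_valid = int(len(template_ids) * ratios[1])
    groups = {
        "train": template_ids[:n_train],
        "valid": template_ids[n_train:n_train + n_valid],
        "test": template_ids[n_train + n_valid:],
    }

    # Expand each group of template ids back into its examples.
    return {
        name: [ex for tid in ids for ex in by_template[tid]]
        for name, ids in groups.items()
    }
```

Because the cut is made over template ids, the split sizes in examples are only approximately 80/10/10 when templates produce different numbers of examples.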
Analysing the Effect of Clarifying Questions on Document Ranking in Conversational Search
Antonios Minas Krasakis, Mohammad Aliannejadi, Nikos Voskarides, E. Kanoulas
{"title":"Analysing the Effect of Clarifying Questions on Document Ranking in Conversational Search","authors":"Antonios Minas Krasakis, Mohammad Aliannejadi, Nikos Voskarides, E. Kanoulas","doi":"10.1145/3409256.3409817","DOIUrl":"https://doi.org/10.1145/3409256.3409817","url":null,"abstract":"Recent research on conversational search highlights the importance of mixed-initiative in conversations. To enable mixed-initiative, the system should be able to ask clarifying questions to the user. However, the ability of the underlying ranking models (which support conversational search) to account for these clarifying questions and answers has not been analysed when ranking documents, at large. To this end, we analyse the performance of a lexical ranking model on a conversational search dataset with clarifying questions. We investigate, both quantitatively and qualitatively, how different aspects of clarifying questions and user answers affect the quality of ranking. We argue that there needs to be some fine-grained treatment of the entire conversational round of clarification, based on the explicit feedback which is present in such mixed-initiative settings. Informed by our findings, we introduce a simple heuristic-based lexical baseline that significantly outperforms the existing naive baselines. Our work aims to enhance our understanding of the challenges present in this particular task and inform the design of more appropriate conversational ranking models.","PeriodicalId":430907,"journal":{"name":"Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114906196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 37
Declarative Experimentation in Information Retrieval using PyTerrier
C. Macdonald, N. Tonellotto
{"title":"Declarative Experimentation in Information Retrieval using PyTerrier","authors":"C. Macdonald, N. Tonellotto","doi":"10.1145/3409256.3409829","DOIUrl":"https://doi.org/10.1145/3409256.3409829","url":null,"abstract":"The advent of deep machine learning platforms such as TensorFlow and PyTorch, developed in expressive high-level languages such as Python, has allowed more expressive representations of deep neural network architectures. We argue that such a powerful formalism is missing in information retrieval (IR), and propose a framework called PyTerrier that allows advanced retrieval pipelines to be expressed, and evaluated, in a declarative manner close to their conceptual design. Like the aforementioned frameworks that compile deep learning experiments into primitive GPU operations, our framework targets IR platforms as backends in order to execute and evaluate retrieval pipelines. Further, we can automatically optimise the retrieval pipelines to increase their efficiency to suit a particular IR platform backend. Our experiments, conducted on TREC Robust and ClueWeb09 test collections, demonstrate the efficiency benefits of these optimisations for retrieval pipelines involving both the Anserini and Terrier IR platforms.","PeriodicalId":430907,"journal":{"name":"Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115767906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 85
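To make the idea of declaratively composed retrieval pipelines concrete, here is a toy sketch of operator-based composition in plain Python. It does not use PyTerrier and is not its API; the `Stage` class, the in-memory `DOCS` collection, and both stage functions are hypothetical. Only the choice of `>>` for "then" and `%` for a rank cutoff is borrowed from PyTerrier's notation.

```python
class Stage:
    """Hypothetical pipeline stage wrapping a single function."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, value):
        return self.fn(value)

    def __rshift__(self, other):
        # a >> b: run a, then pass its output to b.
        return Stage(lambda value: other(self(value)))

    def __mod__(self, k):
        # a % k: keep only the top-k results produced by a.
        return Stage(lambda value: self(value)[:k])


# Toy in-memory collection (assumption, for illustration only).
DOCS = {
    "d1": "phrase queries over text",
    "d2": "neural sentiment model",
    "d3": "optimizing phrase queries",
}


def term_overlap(query):
    """Score each document by the number of query terms it contains."""
    terms = set(query.split())
    scored = [(doc_id, len(terms & set(text.split()))) for doc_id, text in DOCS.items()]
    # Highest score first; ties broken by document id for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def drop_zero(results):
    """Remove documents that matched no query term."""
    return [(doc_id, score) for doc_id, score in results if score > 0]


# The pipeline reads close to its conceptual design: retrieve, cut to 2, filter.
pipeline = (Stage(term_overlap) % 2) >> Stage(drop_zero)
print(pipeline("phrase queries"))
```

Because the pipeline is an object built from operators rather than a sequence of imperative calls, a framework could in principle inspect and rewrite it before execution, which is the kind of optimisation the abstract describes.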