Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval: Latest Papers

Clarifying Questions in Math Information Retrieval
Behrooz Mansouri, Zahra Jahedibashiz
Abstract: One of the challenges of math information retrieval is the inherent ambiguity of mathematical notation. The use of various notations, symbols, and conventions can lead to ambiguities in math search queries, potentially causing confusion and errors. Therefore, asking clarifying questions in math search can help remove these ambiguities. Despite advances in incorporating clarifying questions for search, little is currently understood about the characteristics of these questions in math. This paper investigates math clarifying questions asked on the MathStackExchange community question answering platform, analyzing a total of 495,431 clarifying questions and their usefulness. The results of the analysis uncover specific patterns in useful clarifying questions that provide insight into the design considerations for future conversational math search systems. The formulae used in clarifying questions are closely related to those in the initial queries and are accompanied by common phrases seeking the missing information related to the formulae. Additionally, experiments utilizing clarifying questions for math search demonstrate the potential benefits of incorporating them alongside the original query.
DOI: https://doi.org/10.1145/3578337.3605123
Published: 2023-08-09
Citations: 0
Revisiting Condorcet Fusion
Liron Tyomkin, Oren Kurland
Abstract: The fusion task is to aggregate ranked document lists retrieved for a query. The Condorcet voting criterion served as inspiration for a commonly used fusion method proposed by Montague and Aslam (2002). The method is stochastic as it is based on the QuickSort sorting algorithm. We empirically show that the performance of the method can substantially vary due to this stochastic aspect. We propose approaches that improve the performance robustness of this fusion method with respect to its stochastic nature. The resultant performance is on par with the state-of-the-art.
DOI: https://doi.org/10.1145/3578337.3605140
Published: 2023-08-09
Citations: 0
Automatic Hint Generation
A. Jatowt, Calvin Gehrer, Michael Färber
Abstract: At times when answers to user questions are readily and easily available (at essentially zero cost), it is important for humans to maintain their knowledge and strong reasoning capabilities. We believe that in many cases providing hints rather than final answers should be sufficient and beneficial for users, as it requires thinking and stimulates learning as well as remembering processes. We propose in this paper a novel task of automatic hint generation that supports users in finding the correct answers to their questions without the need to look the answers up. As the first attempt towards this new task, we design and implement an approach that uses Wikipedia to automatically provide hints for any input question-answer pair. We then evaluate our approach with a user group of 10 persons and demonstrate that the generated hints help users successfully answer more questions than when provided with baseline hints.
DOI: https://doi.org/10.1145/3578337.3605119
Published: 2023-08-09
Citations: 0
Content-Based Relevance Estimation in Retrieval Settings with Ranking-Incentivized Document Manipulations
Ziv Vasilisky, Oren Kurland, Moshe Tennenholtz, Fiana Raiber
Abstract: In retrieval settings such as the Web, many document authors are ranking incentivized: they opt to have their documents highly ranked for queries of interest. Consequently, they often respond to rankings by modifying their documents. These modifications can hurt retrieval effectiveness even if the resultant documents are of high quality. We present novel content-based relevance estimates which are "ranking-incentives aware"; that is, the underlying assumption is that content can be the result of ranking incentives rather than of pure authorship considerations. The suggested estimates are based on inducing information from past dynamics of the document corpus. Empirical evaluation attests to the clear merits of our most effective methods. For example, they substantially outperform state-of-the-art approaches that were not designed to address ranking-incentivized document manipulations.
DOI: https://doi.org/10.1145/3578337.3605124
Published: 2023-08-09
Citations: 0
Entity-Based Relevance Feedback for Document Retrieval
Eilon Sheetrit, Fiana Raiber, Oren Kurland
Abstract: There is a long history of work on using relevance feedback for ad hoc document retrieval. The main types of relevance feedback studied thus far are for documents, passages and terms. We explore the merits of using relevance feedback provided for entities in an entity repository. We devise retrieval methods that can utilize relevance feedback provided for tokens, whether entities or terms. Empirical evaluation shows that using entity relevance feedback falls short with respect to utilizing term feedback on average, but is much more effective for difficult queries. Furthermore, integrating term and entity relevance feedback is of clear merit; e.g., for augmenting minimal document feedback. We also contrast approaches to presenting entities and terms for soliciting relevance feedback.
DOI: https://doi.org/10.1145/3578337.3605128
Published: 2023-08-09
Citations: 0
Assessment of the Quality of Topic Models for Information Retrieval Applications
Meng Yuan, P. Lin, Lida Rashidi, J. Zobel
Abstract: Topic modelling is an approach to generation of descriptions of document collections as a set of topics where each has a distinct theme and documents are a blend of topics. It has been applied to retrieval in a range of ways, but there has been little prior work on measurement of whether the topics are descriptive in this context. Moreover, existing methods for assessment of topic quality do not consider how well individual documents are described. To address this issue we propose a new measure of topic quality, which we call specificity; the basis of this measure is the extent to which individual documents are described by a limited number of topics. We also propose a new experimental protocol for validating topic-quality measures, a 'noise dial' that quantifies the extent to which the measure's scores are altered as the topics are degraded by addition of noise. The principle of the mechanism is that a meaningful measure should produce low scores if the 'topics' are essentially random. We show that specificity is at least as effective as existing measures of topic quality and does not require external resources. While other measures relate only to topics, not to documents, we further show that specificity correlates to the extent to which topic models are informative in the retrieval process.
DOI: https://doi.org/10.1145/3578337.3605118
Published: 2023-08-09
Citations: 0
The Effectiveness of Quantum Random Walk Model in Recommender Systems
Hiroshi Wayama, Kazunari Sugiyama
Abstract: Graph Convolutional Networks (GCNs) are effective in providing more relevant items at higher rankings in recommender systems. However, in real-world scenarios, it is important to provide recommended items with diversity and novelty as well as relevance to each user's preference. Additionally, users often desire a wide range of recommendations not just based on their past search behaviors and histories. To enhance each user's satisfaction, it is important to develop a recommender system that provides much more relevant and diverse items. LightGCN, a GCN-based recommender system that learns latent vectors of users and items using multiple layers of aggregation functions and an adjacency matrix, can achieve this. However, LightGCN often provides recommendations without diversity when the number of layers is insufficient. On the other hand, when the number is excessive, the accuracy declines, which is known as the over-smoothing problem. To overcome this, we propose a novel approach using a continuous-time quantum walk model derived from a quantum algorithm to reconstruct the user-item adjacency matrix of LightGCN, improving the relevance and diversity of recommendations.
DOI: https://doi.org/10.1145/3578337.3605141
Published: 2023-08-09
Citations: 0
Evaluating Parrots and Sociopathic Liars (keynote)
T. Sakai
Abstract: This talk builds on my SWAN (Schematised Weighted Average Nugget) paper published in May 2023, which discusses a generic framework for auditing a given textual conversational system. The framework assumes that conversation sessions have already been sampled through either human-in-the-loop experiments or user simulation, and is designed to handle task-oriented and non-task-oriented conversations seamlessly. The arXiv paper also discussed a schema containing twenty (+1) criteria for scoring nuggets (i.e., factual statements and dialogue acts within each turn of the conversations) either manually or (semi)automatically. By "parrots," I am referring to the stochastic parrots of Professor Emily M. Bender et al., i.e., large language models. By "sociopathic liars," I am referring to the same thing, as Professor Shannon Bowen of the University of South Carolina describes them as follows. "Sociopathic liars are the most damaging types of liars because they lie on a routine basis without conscience and often without reason. Whereas pathetic liars lie to get along, and narcissistic liars prevaricate to cover their inaction, drama, or ineptitude, sociopaths lie simply because they feel like it. Lying is easy for them, and they lie with no conscience or remorse." I would like to primarily discuss how researchers might be able to prevent conversational systems from doing harm to users, to labellers, and to society, rather than how we might evaluate good things that the systems might bring just to privileged people. Furthermore, I would like to argue that ICTIR is a perfect place for such a discussion.
DOI: https://doi.org/10.1145/3578337.3605144
Published: 2023-08-09
Citations: 1
A Deep Generative Recommendation Method for Unbiased Learning from Implicit Feedback
Shashank Gupta, Harrie Oosterhuis, M. de Rijke
Abstract: Variational autoencoders (VAEs) are the state-of-the-art model for recommendation with implicit feedback signals. Unfortunately, implicit feedback suffers from selection bias, e.g., popularity bias, position bias, etc., and as a result, training from such signals produces biased recommendation models. Existing methods for debiasing the learning process have not been applied in a generative setting. We address this gap by introducing an inverse propensity scoring (IPS) based method for training VAEs from implicit feedback data in an unbiased way. Our IPS-based estimator for the VAE training objective, VAE-IPS, is provably unbiased w.r.t. selection bias. Our experimental results show that the proposed VAE-IPS model reaches significantly higher performance than existing baselines. Our contributions enable practitioners to combine state-of-the-art VAE recommendation techniques with the advantages of bias mitigation for implicit feedback.
DOI: https://doi.org/10.1145/3578337.3605114
Published: 2023-08-09
Citations: 0
Hierarchical Transformer-based Query by Multiple Documents
Zhiqi Huang, Sheikh Muhammad Sarwar
Abstract: It is often difficult for users to form keywords to express their information needs, especially when they are not familiar with the domain of the articles of interest. Moreover, in some search scenarios, there is no explicit query for the search engine to work with. Query-By-Multiple-Documents (QBMD), in which the information needs are implicitly represented by a set of relevant documents, addresses these retrieval scenarios. Unlike the keyword-based retrieval task, the query documents are treated as exemplars of a hidden query topic, but it is often the case that they can be relevant to multiple topics. In this paper, we present a Hierarchical Interaction-based (HINT) bi-encoder retrieval architecture that encodes a set of query documents and retrieval documents separately for the QBMD task. We design a hierarchical attention mechanism that allows the model to 1) encode long sequences efficiently and 2) learn the interactions at low-level and high-level semantics (e.g., tokens and paragraphs) across multiple documents. With contextualized representations, the final scoring is calculated based on a stratified late interaction, which ensures each query document contributes equally to the matching against the candidate document. We build a large-scale, weakly supervised QBMD retrieval dataset based on Wikipedia for model training. We evaluate the proposed model on both Query-By-Single-Document (QBSD) and QBMD tasks. For QBSD, we use a benchmark dataset for legal case retrieval. For QBMD, we transform standard keyword-based retrieval datasets into the QBMD setting. Our experimental results show that HINT significantly outperforms all competitive baselines.
DOI: https://doi.org/10.1145/3578337.3605130
Published: 2023-08-09
Citations: 0