Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval: Latest Articles

MA-MRC: A Multi-answer Machine Reading Comprehension Dataset
Zhiang Yue, Jingping Liu, Cong Zhang, Chao Wang, Haiyun Jiang, Yue Zhang, Xianyang Tian, Zhedong Cen, Yanghua Xiao, Tong Ruan
{"title":"MA-MRC: A Multi-answer Machine Reading Comprehension Dataset","authors":"Zhiang Yue, Jingping Liu, Cong Zhang, Chao Wang, Haiyun Jiang, Yue Zhang, Xianyang Tian, Zhedong Cen, Yanghua Xiao, Tong Ruan","doi":"10.1145/3539618.3592015","DOIUrl":"https://doi.org/10.1145/3539618.3592015","url":null,"abstract":"Machine reading comprehension (MRC) is an essential task for many question-answering applications. However, existing MRC datasets mainly focus on data with single answer and overlook multiple answers, which are common in the real world. In this paper, we aim to construct an MRC dataset with both data of single answer and multiple answers. To achieve this purpose, we design a novel pipeline method: data collection, data cleaning, question generation and test set annotation. Based on these procedures, we construct a high-quality multi-answer MRC dataset (MA-MRC) with 129K question-answer-context samples. We implement a sequence of baselines and carry out extensive experiments on MA-MRC. According to the experimental results, MA-MRC is a challenging dataset, which can facilitate the future research on the multi-answer MRC task.","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126885238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Interactive Recommendation System for Meituan Waimai
Chen Ji, Yacheng Li, Rui Li, Fei Jiang, Xiang Li, Wei Lin, Chenglong Zhang, Wei Wang, Shuya Wang
{"title":"Interactive Recommendation System for Meituan Waimai","authors":"Chen Ji, Yacheng Li, Rui Li, Fei Jiang, Xiang Li, Wei Lin, Chenglong Zhang, Wei Wang, Shuya Wang","doi":"10.1145/3539618.3591830","DOIUrl":"https://doi.org/10.1145/3539618.3591830","url":null,"abstract":"As the largest local retail & instant delivery platform in China, Meituan Waimai has deployed a personalized recommender system on server and recommend nearby stores to users through APP homepage. To capture real-time intention of users and flexibly adjust the recommendation results on the homepage, we further add an interactive recommender system. The existing interactive recommender systems in the industry mainly capture intention of users based on their feedback on a specific UI of questions. However, we find that it will undermine use fluency and increase use complexity by rashly inserting a new question UI when users browse the homepage. Therefore, we develop an Embedded Interactive Recommender System (EIRS) that directly infers users' intention according to their click behaviors on the homepage and dynamically inserts a new recommendation result into the homepage. 
To demonstrate the effectiveness of EIRS, we conduct systematic online A/B Tests, where click-through & conversion rate of the inserted EIRS result is 132% higher than that of the initial result on the homepage, and the overall gross merchandise volume is effectively enhanced by 0.43%.","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126674976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Beyond Two-Tower Matching: Learning Sparse Retrievable Cross-Interactions for Recommendation
Liangcai Su, Fan Yan, Jieming Zhu, Xi Xiao, Haoyi Duan, Zhou Zhao, Zhenhua Dong, Ruiming Tang
{"title":"Beyond Two-Tower Matching: Learning Sparse Retrievable Cross-Interactions for Recommendation","authors":"Liangcai Su, Fan Yan, Jieming Zhu, Xi Xiao, Haoyi Duan, Zhou Zhao, Zhenhua Dong, Ruiming Tang","doi":"10.1145/3539618.3591643","DOIUrl":"https://doi.org/10.1145/3539618.3591643","url":null,"abstract":"Two-tower models are a prevalent matching framework for recommendation, which have been widely deployed in industrial applications. The success of two-tower matching attributes to its efficiency in retrieval among a large number of items, since the item tower can be precomputed and used for fast Approximate Nearest Neighbor (ANN) search. However, it suffers two main challenges, including limited feature interaction capability and reduced accuracy in online serving. Existing approaches attempt to design novel late interactions instead of dot products, but they still fail to support complex feature interactions or lose retrieval efficiency. To address these challenges, we propose a new matching paradigm named SparCode, which supports not only sophisticated feature interactions but also efficient retrieval. Specifically, SparCode introduces an all-to-all interaction module to model fine-grained query-item interactions. Besides, we design a discrete code-based sparse inverted index jointly trained with the model to achieve effective and efficient model inference. Extensive experiments have been conducted on open benchmark datasets to demonstrate the superiority of our framework. 
The results show that SparCode significantly improves the accuracy of candidate item matching while retaining the same level of retrieval efficiency with two-tower models.","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126935187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
XpmIR: A Modular Library for Learning to Rank and Neural IR Experiments
Yuxuan Zong, Benjamin Piwowarski
{"title":"XpmIR: A Modular Library for Learning to Rank and Neural IR Experiments","authors":"Yuxuan Zong, Benjamin Piwowarski","doi":"10.1145/3539618.3591818","DOIUrl":"https://doi.org/10.1145/3539618.3591818","url":null,"abstract":"During past years, several frameworks for (Neural) Information Retrieval have been proposed. However, while they allow reproducing already published results, it is still very hard to re-use some parts of the learning pipelines, such as for instance the pre-training, sampling strategy, or a loss in newly developed models. It is also difficult to use new training techniques with old models, which makes it more difficult to assess the usefulness of ideas on various neural IR models. This slows the adoption of new techniques, and in turn, the development of the IR field. In this paper, we present XpmIR, a Python library defining a reusable set of experimental components. The library already contains state-of-the-art models and indexation techniques and is integrated with the HuggingFace hub.","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127008587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Quantifying Ranker Coverage of Different Query Subspaces
Negar Arabzadeh, A. Bigdeli, Radin Hamidi Rad, E. Bagheri
{"title":"Quantifying Ranker Coverage of Different Query Subspaces","authors":"Negar Arabzadeh, A. Bigdeli, Radin Hamidi Rad, E. Bagheri","doi":"10.1145/3539618.3592045","DOIUrl":"https://doi.org/10.1145/3539618.3592045","url":null,"abstract":"The information retrieval community has observed significant performance improvements over various tasks due to the introduction of neural architectures. However, such improvements do not necessarily seem to have happened uniformly across a range of queries. As we will empirically show in this paper, the performance of neural rankers follow a long-tail distribution where there are many subsets of queries, which are not effectively satisfied by neural methods. Despite this observation, performance is often reported using standard retrieval metrics, such as MRR or nDCG, which capture average performance over all queries. As such, it is not clear whether reported improvements are due to incremental boost on a small subset of already well-performing queries or addressing queries that have been difficult to address by existing methods. In this paper, we propose the Task Subspace Coverage (TaSC /tAHsk/) metric, which systematically quantifies whether and to what extent improvements in retrieval effectiveness happen on similar or disparate query subspaces for different rankers. 
Our experiments show that the consideration of our proposed TaSC metric in conjunction with existing ranking metrics provides deeper insight into ranker performance and their contribution to overall advances on a given task.","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127606538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
AutoTransfer: Instance Transfer for Cross-Domain Recommendations
Jingtong Gao, Xiangyu Zhao, Bo Chen, Fan Yan, Huifeng Guo, Ruiming Tang
{"title":"AutoTransfer: Instance Transfer for Cross-Domain Recommendations","authors":"Jingtong Gao, Xiangyu Zhao, Bo Chen, Fan Yan, Huifeng Guo, Ruiming Tang","doi":"10.1145/3539618.3591701","DOIUrl":"https://doi.org/10.1145/3539618.3591701","url":null,"abstract":"Cross-Domain Recommendation (CDR) is a widely used approach for leveraging information from domains with rich data to assist domains with insufficient data. A key challenge of CDR research is the effective and efficient transfer of helpful information from source domain to target domain. Currently, most existing CDR methods focus on extracting implicit information from the source domain to enhance the target domain. However, the hidden structure of the extracted implicit information is highly dependent on the specific CDR model, and is therefore not easily reusable or transferable. Additionally, the extracted implicit information only appears within the intermediate substructure of specific CDRs during training and is thus not easily retained for more use. In light of these challenges, this paper proposes AutoTransfer, with an Instance Transfer Policy Network, to selectively transfers instances from source domain to target domain for improved recommendations. Specifically, AutoTransfer acts as an agent that adaptively selects a subset of informative and transferable instances from the source domain. Notably, the selected subset possesses extraordinary re-utilization property that can be saved for improving model training of various future RS models in target domain. Experimental results on two public CDR benchmark datasets demonstrate that the proposed method outperforms state-of-the-art CDR baselines and classic Single-Domain Recommendation (SDR) approaches. 
The implementation code is available for easy reproduction.","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127675150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Computational Versus Perceived Popularity Miscalibration in Recommender Systems
Oleg Lesota, Gustavo Escobedo, Yashar Deldjoo, B. Ferwerda, Simone Kopeinik, E. Lex, Navid Rekabsaz, M. Schedl
{"title":"Computational Versus Perceived Popularity Miscalibration in Recommender Systems","authors":"Oleg Lesota, Gustavo Escobedo, Yashar Deldjoo, B. Ferwerda, Simone Kopeinik, E. Lex, Navid Rekabsaz, M. Schedl","doi":"10.1145/3539618.3591964","DOIUrl":"https://doi.org/10.1145/3539618.3591964","url":null,"abstract":"Popularity bias in recommendation lists refers to over-representation of popular content and is a challenge for many recommendation algorithms. Previous research has suggested several offline metrics to quantify popularity bias, which commonly relate the popularity of items in users' recommendation lists to the popularity of items in their interaction history. Discrepancies between these two factors are referred to as popularity miscalibration. While popularity metrics provide a straightforward and well-defined means to measure popularity bias, it is unknown whether they actually reflect users' perception of popularity bias. To address this research gap, we conduct a crowd-sourced user study on Prolific, involving 56 participants, to (1) investigate whether the level of perceived popularity miscalibration differs between common recommendation algorithms, (2) assess the correlation between perceived popularity miscalibration and its corresponding quantification according to a common offline metric. We conduct our study in a well-defined and important domain, namely music recommendation using the standardized LFM-2b dataset, and quantify popularity miscalibration of five recommendation algorithms by utilizing Jensen-Shannon distance (JSD). Challenging the findings of previous studies, we observe that users generally do perceive significant differences in terms of popularity bias between algorithms if this bias is framed as popularity miscalibration. 
In addition, JSD correlates moderately with users' perception of popularity, but not with their perception of unpopularity.","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127887462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
StreamE: Learning to Update Representations for Temporal Knowledge Graphs in Streaming Scenarios
Jiasheng Zhang, Jie Shao, Bin Cui
{"title":"StreamE: Learning to Update Representations for Temporal Knowledge Graphs in Streaming Scenarios","authors":"Jiasheng Zhang, Jie Shao, Bin Cui","doi":"10.1145/3539618.3591772","DOIUrl":"https://doi.org/10.1145/3539618.3591772","url":null,"abstract":"Learning representations for temporal knowledge graphs (TKGs) is a fundamental task. Most existing methods regard TKG as a sequence of static snapshots and recurrently learn representations by retracing the previous snapshots. However, new knowledge can be continuously accrued to TKGs as streams. These methods either cannot handle new entities or fail to update representations in real time, making them unfeasible to adapt to the streaming scenarios. In this paper, we propose a lightweight framework called StreamE towards the efficient generation of TKG representations in streaming scenarios. To reduce the parameter size, entity representations in StreamE are decoupled from the model training to serve as the memory module to store the historical information of entities. To achieve efficient update and generation, the process of generating representations is decoupled as two functions in StreamE. An update function is learned to incrementally update entity representations based on the newly-arrived knowledge and a read function is learned to predict the future semantics of entity representations. The update function avoids the recurrent modeling paradigm and thus gains high efficiency while the read function considers multiple semantic change properties. We further propose a joint training strategy with two temporal regularizations to effectively optimize the framework. Experimental results show that StreamE can achieve better performance than baseline methods with 100x faster in inference, 25x faster in training, and only 1/5 parameter size, which demonstrates its superiority. 
Code is available at https://github.com/zjs123/StreamE.","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129006197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The JOKER Corpus: English-French Parallel Data for Multilingual Wordplay Recognition
Liana Ermakova, Anne-Gwenn Bosser, A. Jatowt, Tristan Miller
{"title":"The JOKER Corpus: English-French Parallel Data for Multilingual Wordplay Recognition","authors":"Liana Ermakova, Anne-Gwenn Bosser, A. Jatowt, Tristan Miller","doi":"10.1145/3539618.3591885","DOIUrl":"https://doi.org/10.1145/3539618.3591885","url":null,"abstract":"Despite recent advances in information retrieval and natural language processing, rhetorical devices that exploit ambiguity or subvert linguistic rules remain a challenge for such systems. However, corpus-based analysis of wordplay has been a perennial topic of scholarship in the humanities, including literary criticism, language education, and translation studies. The immense data-gathering effort required for these studies points to the need for specialized text retrieval and classification technology, and consequently for appropriate test collections. In this paper, we introduce and analyze a new dataset for research and applications in the retrieval and processing of wordplay. Developed for the JOKER track at CLEF 2023, our annotated corpus extends and improves upon past English wordplay detection datasets in several ways. First, we introduce hundreds of additional positive examples of wordplay; second, we provide French translations for the examples; and third, we provide negative examples of non-wordplay with characteristics closely matching those of the positive examples. This last feature helps ensure that AI models learn to effectively distinguish wordplay from non-wordplay, and not simply texts differing in length, style, or vocabulary. 
Our test collection represents then a step towards wordplay-aware multilingual information retrieval.","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128718401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Fine-Grained Preference-Aware Personalized Federated POI Recommendation with Data Sparsity
Xiao Zhang, Ziming Ye, Jianfeng Lu, Fuzhen Zhuang, Yanwei Zheng, Dongxiao Yu
{"title":"Fine-Grained Preference-Aware Personalized Federated POI Recommendation with Data Sparsity","authors":"Xiao Zhang, Ziming Ye, Jianfeng Lu, Fuzhen Zhuang, Yanwei Zheng, Dongxiao Yu","doi":"10.1145/3539618.3591688","DOIUrl":"https://doi.org/10.1145/3539618.3591688","url":null,"abstract":"With the raised privacy concerns and rigorous data regulations, federated learning has become a hot collaborative learning paradigm for the recommendation model without sharing the highly sensitive POI data. However, the time-sensitive, heterogeneous, and limited POI records seriously restrict the development of federated POI recommendation. To this end, in this paper, we design the fine-grained preference-aware personalized federated POI recommendation framework, namely PrefFedPOI, under extremely sparse historical trajectories to address the above challenges. In details, PrefFedPOI extracts the fine-grained preference of current time slot by combining historical recent preferences and periodic preferences within each local client. Due to the extreme lack of POI data in some time slots, a data amount aware selective strategy is designed for model parameters uploading. Moreover, a performance enhanced clustering mechanism with reinforcement learning is proposed to capture the preference relatedness among all clients to encourage the positive knowledge sharing. Furthermore, a clustering teacher network is designed for improving efficiency by clustering guidance. Extensive experiments are conducted on two diverse real-world datasets to demonstrate the effectiveness of proposed PrefFedPOI comparing with state-of-the-arts. 
In particular, personalized PrefFedPOI can achieve 7% accuracy improvement on average among data-sparsity clients.","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115935139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0