Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval: Latest Publications

An Effective Framework for Enhancing Query Answering in a Heterogeneous Data Lake
Qin Yuan, Ye Yuan, Z. Wen, He Wang, Shiyuan Tang
DOI: https://doi.org/10.1145/3539618.3591637
Abstract: There has been growing interest in recent years in cross-source search as a way to gain rich knowledge. A data lake collects massive raw, heterogeneous data with differing data schemas and query interfaces. Many real-life applications, such as e-commerce, bioinformatics, and healthcare, require query answering over such a heterogeneous data lake. In this paper, we propose LakeAns, which semantically integrates the heterogeneous data schemas of the lake to enhance the semantics of query answers. To this end, we propose a novel framework that performs cross-source search efficiently and effectively. The framework exploits a reinforcement learning method to semantically integrate the data schemas and create a global relational schema for the heterogeneous data. It then runs a query-answering algorithm over the global schema to find answers across multiple data sources. Extensive experimental evaluations on real-life data verify that our approach outperforms existing solutions in both effectiveness and efficiency.
Citations: 0
Semantic-enhanced Modality-asymmetric Retrieval for Online E-commerce Search
Zhigong Zhou, Ning Ding, Xiaochuan Fan, Yue Shang, Yiming Qiu, Jingwei Zhuo, Zhiwei Ge, Songlin Wang, Lin Liu, Sulong Xu, Han Zhang
DOI: https://doi.org/10.1145/3539618.3591863
Abstract: Semantic retrieval, which retrieves semantically matched items for a given textual query, has been an essential component for improving system effectiveness in e-commerce search. In this paper, we study the multimodal retrieval problem, where an item's visual information (e.g., images) is leveraged to supplement its textual information, enriching the item representation and further improving retrieval performance. Although learning from cross-modality data has been studied extensively in tasks such as visual question answering and media summarization, multimodal retrieval remains a non-trivial, unsolved problem, especially in the asymmetric scenario where the query is unimodal while the item is multimodal. We propose a novel model named SMAR, short for Semantic-enhanced Modality-Asymmetric Retrieval, to tackle modality fusion and alignment in this asymmetric scenario. Extensive experiments on an industrial dataset show that the proposed model significantly outperforms baseline models in retrieval accuracy. We have open-sourced our industrial dataset for the sake of reproducibility and future research.
Citations: 0
Rating Prediction in Conversational Task Assistants with Behavioral and Conversational-Flow Features
Rafael Ferreira, David Semedo, João Magalhães
DOI: https://doi.org/10.1145/3539618.3592048
Abstract: Predicting the success of Conversational Task Assistants (CTAs) is critical to understanding user behavior and acting accordingly. In this paper, we propose TB-Rater, a Transformer model that combines conversational-flow features with user-behavior features to predict user ratings in a CTA scenario. In particular, we use real human-agent conversations and ratings collected in the Alexa TaskBot challenge, a novel multimodal and multi-turn conversational setting. Our results show the advantages of modeling both the conversational-flow and behavioral aspects of a conversation in a single model for offline rating prediction. Additionally, an analysis of CTA-specific behavioral features brings insights into this setting and can be used to bootstrap future systems.
Citations: 1
Multimodal Named Entity Recognition and Relation Extraction with Retrieval-Augmented Strategy
Xuming Hu
DOI: https://doi.org/10.1145/3539618.3591790
Abstract: Multimodal Named Entity Recognition (MNER) and Multimodal Relation Extraction (MRE) are information-retrieval tasks that aim to recognize entities and extract the relations among them using information from multiple modalities, such as text and images. Although current methods have attempted a variety of modality-fusion approaches to enrich the information in text, a large amount of readily available internet retrieval data has not been exploited. We therefore retrieve real-world text related to images, objects, and entire sentences from the internet and use this retrieved text as input for cross-modal fusion, improving performance on entity and relation extraction.
Citations: 0
Calibration Learning for Few-shot Novel Product Description
Zheng Liu, Mingjing Wu, Bo Peng, Yichao Liu, Qi Peng, Chong Zou
DOI: https://doi.org/10.1145/3539618.3591959
Abstract: In e-commerce, the rapid introduction of new products poses challenges for product-description generation. Traditional approaches rely on large labelled datasets, which are often unavailable for novel products with limited data. To address this issue, we propose a calibration-learning approach for few-shot novel product description. Our method leverages a small amount of labelled data for calibration and uses the novel product's semantic representation as a prompt to generate accurate and informative descriptions. We evaluate our approach on three large-scale e-commerce datasets of novel products and show that it significantly improves the quality of generated product descriptions over existing methods, especially when only limited data is available. We also analyze the impact of the different modules on performance.
Citations: 0
Improving Programming Q&A with Neural Generative Augmentation
Suthee Chaidaroon, Xiao Zhang, Shruti Subramaniyam, Jeffrey Svajlenko, Tanya Shourya, I. Keivanloo, Ria Joy
DOI: https://doi.org/10.1145/3539618.3591860
Abstract: Knowledge-intensive programming Q&A is an active research area in industry. It boosts developer productivity by helping developers quickly find programming answers in the vast amount of information on the internet. In this study, we propose ProQANS and its variants ReProQANS and ReAugProQANS for programming Q&A. ProQANS is a neural search approach that leverages unlabeled data on the internet (such as StackOverflow) to mitigate the cold-start problem. ReProQANS extends ProQANS with reformulated queries and a novel triplet loss. We further use an auxiliary generative model to augment the training queries and design a novel dual triplet loss to adapt these generated queries, yielding another variant termed ReAugProQANS. In our experiments, ReProQANS performs best on the in-domain test set, while ReAugProQANS is superior on out-of-domain real programming questions, outperforming the state-of-the-art model by up to a 477% lift in MRR. The results suggest robustness to previously unseen questions and wide applicability to real programming questions.
Citations: 0
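The headline result above is a lift in MRR (Mean Reciprocal Rank), a standard retrieval metric: the average, over queries, of one over the rank of the first relevant result. A minimal sketch of the computation, for readers unfamiliar with the metric (illustrative only, not the authors' evaluation code):

```python
def mean_reciprocal_rank(rankings, relevant):
    """MRR: average of 1/rank of the first relevant item per query
    (contributes 0 when no relevant item is retrieved)."""
    total = 0.0
    for query_id, ranking in rankings.items():
        for rank, item in enumerate(ranking, start=1):
            if item in relevant.get(query_id, set()):
                total += 1.0 / rank
                break
    return total / len(rankings)

# Toy example: first relevant hit at rank 1 for q1 and rank 2 for q2,
# so MRR = (1 + 0.5) / 2 = 0.75
rankings = {"q1": ["a", "b"], "q2": ["c", "d"]}
relevant = {"q1": {"a"}, "q2": {"d"}}
print(mean_reciprocal_rank(rankings, relevant))  # 0.75
```

A "477% lift" on this metric means the first correct answer appears far earlier in the ranking on average than with the baseline.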
AdaMCL: Adaptive Fusion Multi-View Contrastive Learning for Collaborative Filtering
Guanghui Zhu, Wang Lu, C. Yuan, Y. Huang
DOI: https://doi.org/10.1145/3539618.3591632
Abstract: Graph collaborative filtering has achieved great success in capturing users' preferences over items. Despite their effectiveness, graph neural network (GNN)-based methods suffer from data sparsity in real scenarios. Recently, contrastive learning (CL) has been used to address data sparsity, but most CL-based methods leverage only the original user-item interaction graph to construct the CL task and do not explicitly exploit higher-order information (i.e., user-user and item-item relationships). Even for CL-based methods that do use higher-order information, the receptive field for that information is fixed, regardless of differences between nodes. In this paper, we propose AdaMCL, a novel adaptive multi-view fusion contrastive learning framework for graph collaborative filtering. To exploit higher-order information more accurately, we propose an adaptive fusion strategy that fuses the embeddings learned from the user-item and user-user graphs. Moreover, we propose a multi-view fusion contrastive learning paradigm to construct effective CL tasks and, to alleviate the noise introduced by aggregating higher-order neighbors, a layer-level CL task. Extensive experimental results show that AdaMCL is effective and significantly outperforms existing collaborative filtering models.
Citations: 1
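The abstract does not spell out the contrastive objective; most CL-based collaborative filtering methods build on an InfoNCE-style loss that pulls together two views (e.g., two graph-augmented embeddings) of the same node while pushing apart in-batch negatives. A minimal NumPy sketch of that standard loss, given only as an assumption for illustration, not as AdaMCL's actual objective:

```python
import numpy as np

def info_nce(view_a, view_b, temperature=0.2):
    """InfoNCE over two batches of node embeddings: row i of view_a
    should match row i of view_b; every other row in the batch acts
    as a negative. Returns the mean -log p(positive)."""
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                 # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))             # positives sit on the diagonal
```

Aligned views (identical rows in matching positions) yield a small loss; views whose matching rows disagree yield a large one, which is what drives the two embedding views toward agreement during training.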
DICE: a Dataset of Italian Crime Event news
Giovanni Bonisoli, Maria Pia di Buono, Laura Po, Federica Rollo
DOI: https://doi.org/10.1145/3539618.3591904
Abstract: Extracting events from news stories, the aim of several Natural Language Processing (NLP) applications (e.g., question answering, news recommendation, news summarization), is not a trivial task, owing to the complexity of natural language and the fact that news reporting follows journalistic style and norms. These aspects scatter an event description over several sentences within one document (or across documents), applying a mechanism of gradual specification of event-related information. This implies widespread use of co-reference relations among textual elements, conveying non-linear temporal information. In addition, despite state-of-the-art results on several tasks, high-quality training datasets for non-English languages are rarely available. This paper presents our preliminary study toward an annotated Dataset of Italian Crime Event news (DICE). The contributions of the paper are: (1) a corpus of 10,395 crime news articles; (2) an annotation schema; (3) a dataset of 10,395 articles with automatic annotations; and (4) a preliminary manual annotation of 1,000 documents using the proposed schema. The first tests on DICE compared the performance of a manual annotator with that of single-span and multi-span question answering models, and showed that the models still fall short, especially on more complex annotation tasks with limited training data. This underscores the importance of investing in high-quality annotated datasets like DICE, which can provide a solid foundation for training and testing a wide range of NLP models.
Citations: 0
MEME: Multi-Encoder Multi-Expert Framework with Data Augmentation for Video Retrieval
Seong-Min Kang, Yoon-Sik Cho
DOI: https://doi.org/10.1145/3539618.3591726
Abstract: Text-to-video (T2V) retrieval aims to find relevant videos from text queries. The recently introduced Contrastive Language-Image Pretraining (CLIP), a language-vision model pretrained on large-scale image-caption pairs, has been extensively studied for this task. Existing work on T2V aims to transfer CLIP knowledge and focuses on enhancing retrieval performance through fine-grained representation learning. While fine-grained contrast has achieved some remarkable results, less attention has been paid to coarse-grained contrast. To this end, we propose Graph Patch Spreading (GPS), a method that aggregates patches across frames at the coarse-grained level, and apply it within our proposed Multi-Encoder Multi-Expert (MEME) framework. Our scheme is general enough to apply to any existing CLIP-based video-text retrieval model. We demonstrate the effectiveness of our method on existing models over the benchmark datasets MSR-VTT, MSVD, and LSMDC. Our code can be found at https://github.com/kang7734/MEME__.
Citations: 0
VoMBaT: A Tool for Visualising Evaluation Measure Behaviour in High-Recall Search Tasks
Wojciech Kusa, Aldo Lipani, Petr Knoth, A. Hanbury
DOI: https://doi.org/10.1145/3539618.3591802
Abstract: The objective of High-Recall Information Retrieval (HRIR) is to retrieve as many relevant documents as possible for a given search topic. One approach to HRIR is Technology-Assisted Review (TAR), which uses information retrieval and machine learning techniques to aid the review of large document collections. TAR systems are commonly used in legal eDiscovery and systematic literature reviews. Successful TAR systems find the majority of relevant documents with the fewest assessments. Commonly used retrospective evaluation assumes that the system first achieves a specific, fixed recall level and then measures precision or work saved (e.g., precision at r% recall). This approach can obscure how evaluation measures behave in a fixed-recall setting, and it is also problematic when estimating time and money savings during technology-assisted reviews. This paper presents a new visual analytics tool for exploring the dynamics of evaluation measures as a function of recall level. We implemented 18 evaluation measures based on confusion-matrix terms, both from general IR tasks and specific to TAR. The tool allows comparison of the behaviour of these measures in a fixed-recall evaluation setting. It can also simulate savings in time and money, and the count of manual vs. automatic assessments, for different datasets depending on model quality. The tool is open source, and a demo is available at https://vombat.streamlit.app.
Citations: 3
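In the fixed-recall evaluation the abstract describes, every confusion-matrix term follows arithmetically from four quantities: collection size, number of relevant documents, the target recall, and the precision achieved at that recall. A minimal sketch of that derivation (function and field names here are hypothetical, not VoMBaT's API):

```python
def fixed_recall_stats(n_docs, n_relevant, recall, precision_at_recall):
    """Derive confusion-matrix terms for a review stopped at a fixed
    recall level: TP from recall, FP from precision, the rest from totals."""
    tp = recall * n_relevant                    # relevant docs actually found
    reviewed = tp / precision_at_recall         # docs assessed to reach that recall
    fp = reviewed - tp                          # non-relevant docs assessed
    fn = n_relevant - tp                        # relevant docs missed
    tn = n_docs - tp - fp - fn                  # non-relevant docs never assessed
    work_saved = 1 - reviewed / n_docs          # fraction of the collection skipped
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn, "work_saved": work_saved}

# 10,000 documents, 500 relevant, stopping at 80% recall with precision 0.10:
# TP = 400, 4,000 documents reviewed, so roughly 60% of the collection
# is never assessed manually.
stats = fixed_recall_stats(10_000, 500, 0.80, 0.10)
```

Any of the 18 measures mentioned in the abstract can then be written as a function of these four terms, which is what makes their behaviour plottable against the recall level.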