{"title":"An Analysis on Matching Mechanisms and Token Pruning for Late-interaction Models","authors":"Qi Liu, Gang Guo, Jiaxin Mao, Zhicheng Dou, Ji-Rong Wen, Hao Jiang, Xinyu Zhang, Zhao Cao","doi":"10.1145/3639818","DOIUrl":"https://doi.org/10.1145/3639818","url":null,"abstract":"<p>With the development of pre-trained language models, the dense retrieval models have become promising alternatives to the traditional retrieval models that rely on exact match and sparse bag-of-words representations. Different from most dense retrieval models using a bi-encoder to encode each query or document into a dense vector, the recently proposed late-interaction multi-vector models (i.e., ColBERT and COIL) achieve state-of-the-art retrieval effectiveness by using all token embeddings to represent documents and queries and modeling their relevance with a sum-of-max operation. However, these fine-grained representations may cause unacceptable storage overhead for practical search systems. In this study, we systematically analyze the matching mechanism of these late-interaction models and show that the sum-of-max operation heavily relies on the co-occurrence signals and some important words in the document. Based on these findings, we then propose several simple document pruning methods to reduce the storage overhead and compare the effectiveness of different pruning methods on different late-interaction models. We also leverage query pruning methods to further reduce the retrieval latency. 
We conduct extensive experiments on both in-domain and out-of-domain datasets and show that some of the pruning methods used can significantly improve the efficiency of these late-interaction models without substantially hurting their retrieval effectiveness.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"67 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139648700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Counterfactual Explanation for Fairness in Recommendation","authors":"Xiangmeng Wang, Qian Li, Dianer Yu, Qing Li, Guandong Xu","doi":"10.1145/3643670","DOIUrl":"https://doi.org/10.1145/3643670","url":null,"abstract":"<p>Fairness-aware recommendation alleviates discrimination issues to build trustworthy recommendation systems. Explaining the causes of unfair recommendations is critical, as it promotes fairness diagnostics, and thus secures users’ trust in recommendation models. Existing fairness explanation methods suffer high computation burdens due to the large-scale search space and the greedy nature of the explanation search process. Besides, they perform feature-level optimizations with continuous values, which are not applicable to discrete attributes such as gender and age. In this work, we adopt counterfactual explanations from causal inference and propose to generate attribute-level counterfactual explanations, adapting to discrete attributes in recommendation models. We use real-world attributes from Heterogeneous Information Networks (HINs) to empower counterfactual reasoning on discrete attributes. We propose a <i>Counterfactual Explanation for Fairness (CFairER)</i> that generates attribute-level counterfactual explanations from HINs for item exposure fairness. Our <i>CFairER</i> conducts off-policy reinforcement learning to seek high-quality counterfactual explanations, with attentive action pruning reducing the search space of candidate counterfactuals. The counterfactual explanations help to provide rational and proximate explanations for model fairness, while the attentive action pruning narrows the search space of attributes. 
Extensive experiments demonstrate our proposed model can generate faithful explanations while maintaining favorable recommendation performance.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"36 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139579917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MCN4Rec: Multi-Level Collaborative Neural Network for Next Location Recommendation","authors":"Shuzhe Li, Wei Chen, Bin Wang, Chao Huang, Yanwei Yu, Junyu Dong","doi":"10.1145/3643669","DOIUrl":"https://doi.org/10.1145/3643669","url":null,"abstract":"<p>Next location recommendation plays an important role in various location-based services, yielding great value for both users and service providers. Existing methods usually model temporal dependencies with explicit time intervals or learn representation from customized point of interest (POI) graphs with rich context information to capture the sequential patterns among POIs. However, this problem is perceptibly complex because various factors, <i>e</i>.<i>g</i>., users’ preferences, spatial locations, time contexts, activity category semantics, and temporal relations, need to be considered together, while most studies lack sufficient consideration of the collaborative signals. Toward this goal, we propose a novel <underline>M</underline>ulti-Level <underline>C</underline>ollaborative Neural <underline>N</underline>etwork for next location <underline>Rec</underline>ommendation (MCN4Rec). Specifically, we design a multi-level view representation learning with level-wise contrastive learning to collaboratively learn representation from local and global perspectives to capture complex heterogeneous relationships among user, POI, time, and activity categories. Then a causal encoder-decoder is applied to the learned representations of check-in sequences to recommend the next location. Extensive experiments on four real-world check-in mobility datasets demonstrate that our model significantly outperforms the existing state-of-the-art baselines for the next location recommendation. Ablation study further validates the benefits of the collaboration of the designed sub-modules. 
The source code is available at https://github.com/quai-mengxiang/MCN4Rec.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"7 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139580062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can Perturbations Help Reduce Investment Risks? Risk-Aware Stock Recommendation via Split Variational Adversarial Training","authors":"Jiezhu Cheng, Kaizhu Huang, Zibin Zheng","doi":"10.1145/3643131","DOIUrl":"https://doi.org/10.1145/3643131","url":null,"abstract":"<p>In the stock market, a successful investment requires a good balance between profits and risks. Based on the <i>learning to rank</i> paradigm, stock recommendation has been widely studied in quantitative finance to recommend stocks with higher return ratios for investors. Despite the efforts to make profits, many existing recommendation approaches still have some limitations in risk control, which may lead to intolerable paper losses in practical stock investing. To effectively reduce risks, we draw inspiration from adversarial learning and propose a novel <i>Split Variational Adversarial Training</i> (SVAT) method for risk-aware stock recommendation. Essentially, SVAT encourages the stock model to be sensitive to adversarial perturbations of risky stock examples and enhances the model’s risk awareness by learning from perturbations. To generate representative adversarial examples as risk indicators, we devise a variational perturbation generator to model diverse risk factors. Particularly, the variational architecture enables our method to provide a rough risk quantification for investors, showing an additional advantage of interpretability. Experiments on several real-world stock market datasets demonstrate the superiority of our SVAT method. By lowering the volatility of the stock recommendation model, SVAT effectively reduces investment risks and outperforms state-of-the-art baselines by more than 30% in terms of risk-adjusted profits. 
All the experimental data and source code are available at https://drive.google.com/drive/folders/14AdM7WENEvIp5x5bV3zV_i4Aev21C9g6?usp=sharing.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"34 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139554431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tagging Items with Emerging Tags: A Neural Topic Model based Few-Shot Learning Approach","authors":"Shangkun Che, Hongyan Liu, Shen Liu","doi":"10.1145/3641859","DOIUrl":"https://doi.org/10.1145/3641859","url":null,"abstract":"<p>The tagging system has become a primary tool to organize information resources on the Internet, which benefits both users and the platforms. To build a successful tagging system, automatic tagging methods are desired. With the development of society, new tags keep emerging. The problem of tagging items with emerging tags is an open challenge for automatic tagging systems, and it has not been well studied in the literature. We define this problem as a tag-centered cold-start problem in this study and propose a novel neural topic model based few-shot learning method named NTFSL to solve the problem. In our proposed method, we innovatively fuse the topic modeling task with the few-shot learning task, endowing the model with the capability to infer effective topics to solve the tag-centered cold-start problem with the property of interpretability. Meanwhile, we propose a novel neural topic model for the topic modeling task to improve the quality of inferred topics, which helps enhance the tagging performance. Furthermore, we develop a novel inference method based on the variational auto-encoding framework for model inference. We conducted extensive experiments on two real-world datasets and the results demonstrate the superior performance of our proposed model compared with state-of-the-art machine learning methods. 
Case studies also show the interpretability of the model.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"7 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139554081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Should Fairness be a Metric or a Model? A Model-based Framework for Assessing Bias in Machine Learning Pipelines","authors":"John P. Lalor, Ahmed Abbasi, Kezia Oketch, Yi Yang, Nicole Forsgren","doi":"10.1145/3641276","DOIUrl":"https://doi.org/10.1145/3641276","url":null,"abstract":"<p>Fairness measurement is crucial for assessing algorithmic bias in various types of machine learning (ML) models, including ones used for search relevance, recommendation, personalization, talent analytics, and natural language processing. However, the fairness measurement paradigm is currently dominated by fairness metrics that examine disparities in allocation and/or prediction error as univariate key performance indicators (KPIs) for a protected attribute or group. Although important and effective in assessing ML bias in certain contexts such as recidivism, existing metrics don’t work well in many real-world applications of ML characterized by imperfect models applied to an array of instances encompassing a multivariate mixture of protected attributes, that are part of a broader process pipeline. Consequently, the upstream representational harm quantified by existing metrics based on how the model represents protected groups doesn’t necessarily relate to allocational harm in the application of such models in downstream policy/decision contexts. We propose FAIR-Frame, a model-based framework for parsimoniously modeling fairness across multiple protected attributes in regard to the representational and allocational harm associated with the upstream design/development and downstream usage of ML models. We evaluate the efficacy of our proposed framework on two testbeds pertaining to text classification using pretrained language models. The upstream testbeds encompass over fifty thousand documents associated with twenty-eight thousand users, seven protected attributes and five different classification tasks. 
The downstream testbeds span three policy outcomes and over 5.41 million total observations. Results in comparison with several existing metrics show that the upstream representational harm measures produced by FAIR-Frame and other metrics are significantly different from one another, and that FAIR-Frame’s representational fairness measures have the highest percentage alignment and lowest error with allocational harm observed in downstream applications. Our findings have important implications for various ML contexts, including information retrieval, user modeling, digital platforms, and text classification, where responsible and trustworthy AI are becoming an imperative.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"20 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139554160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MultiCBR: Multi-view Contrastive Learning for Bundle Recommendation","authors":"Yunshan Ma, Yingzhi He, Xiang Wang, Yinwei Wei, Xiaoyu Du, Yuyangzi Fu, Tat-Seng Chua","doi":"10.1145/3640810","DOIUrl":"https://doi.org/10.1145/3640810","url":null,"abstract":"<p>Bundle recommendation seeks to recommend a bundle of related items to users to improve both user experience and the profits of the platform. Existing bundle recommendation models have progressed from capturing only user-bundle interactions to the modeling of multiple relations among users, bundles and items. CrossCBR, in particular, incorporates cross-view contrastive learning into a two-view preference learning framework, significantly improving SOTA performance. It does, however, have two limitations: 1) the two-view formulation does not fully exploit all the heterogeneous relations among users, bundles and items; and 2) the “early contrast and late fusion” framework is less effective in capturing user preference and difficult to generalize to multiple views. </p><p>In this paper, we present MultiCBR, a novel <b>Multi</b>-view <b>C</b>ontrastive learning framework for <b>B</b>undle <b>R</b>ecommendation. First, we devise a multi-view representation learning framework capable of capturing all the user-bundle, user-item and bundle-item relations, especially better utilizing the bundle-item affiliations to enhance sparse bundles’ representations. Second, we innovatively adopt an “early fusion and late contrast” design that first fuses the multi-view representations before performing self-supervised contrastive learning. 
In comparison to existing approaches, our framework reverses the order of fusion and contrast, introducing the following advantages: 1) our framework is capable of modeling both cross-view and ego-view preferences, allowing us to achieve enhanced user preference modeling; and 2) instead of requiring a quadratic number of cross-view contrastive losses, we only require two self-supervised contrastive losses, resulting in minimal extra costs. Experimental results on three public datasets indicate that our method outperforms SOTA methods. The code and dataset can be found in the GitHub repo https://github.com/HappyPointer/MultiCBR.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"228 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139554161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MCRPL: A Pretrain, Prompt & Fine-tune Paradigm for Non-overlapping Many-to-one Cross-domain Recommendation","authors":"Hao Liu, Lei Guo, Lei Zhu, Yongqiang Jiang, Min Gao, Hongzhi Yin","doi":"10.1145/3641860","DOIUrl":"https://doi.org/10.1145/3641860","url":null,"abstract":"<p>Cross-domain Recommendation (CR) is the task of improving the recommendations in the sparse target domain by leveraging the information from other rich domains. Existing methods of cross-domain recommendation mainly focus on overlapping scenarios by assuming users are totally or partially overlapped, which are taken as bridges to connect different domains. However, this assumption does not always hold since it is illegal to leak users’ identity information to other domains. Conducting Non-overlapping Many-to-one CR (NMCR) is challenging since 1) the absence of overlapping information prevents us from directly aligning different domains, and this situation may get worse in the many-to-one CR (MCR) scenario, and 2) the difference in distribution between source and target domains makes it difficult for us to learn common information across domains. To overcome the above challenges, we focus on NMCR, and devise MCRPL as our solution. To address Challenge 1, we first learn shared domain-agnostic and domain-dependent prompts, and pre-train them in the pre-training stage. To address Challenge 2, we further update the domain-dependent prompts with other parameters kept fixed to transfer the domain knowledge to the target domain. We conduct experiments on five real-world domains, and the results show the advantage of our MCRPL method compared with several recent SOTA baselines. 
Moreover, our source code has been publicly released<sup>1</sup>.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"7 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139516392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Representations of Information Needs from Digital Activity Context","authors":"Tung Vuong, Tuukka Ruotsalo","doi":"10.1145/3639819","DOIUrl":"https://doi.org/10.1145/3639819","url":null,"abstract":"<p>Information retrieval systems often consider search-session and immediately preceding web-browsing history as the context for predicting users’ present information needs. However, such context is only available when a user’s information needs originate from web context or when users have issued preceding queries in the search session. Here, we study the effect of more extensive context information recorded from users’ everyday digital activities by monitoring all information interacted with and communicated using personal computers. Twenty individuals were recruited for 14 days of 24/7 continuous monitoring of their digital activities, including screen contents, clicks, and operating system logs on Web and non-Web applications. Using this data, a transformer architecture is applied to model the digital activity context and predict representations of personalized information needs. Subsequently, the representations of information needs are used for query prediction, query auto-completion, selected search result prediction, and Web search re-ranking. The predictions of the models are evaluated against the ground truth data obtained from the activity recordings. The results reveal that the models accurately predict representations of information needs improving over the conventional search session and web-browsing contexts. 
The results indicate that the present practice for utilizing users’ contextual information is limited and can be significantly extended to achieve improved search interaction support and performance.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"1 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139476820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intent-oriented Dynamic Interest Modeling for Personalized Web Search","authors":"Yutong Bai, Yujia Zhou, Zhicheng Dou, Ji-Rong Wen","doi":"10.1145/3639817","DOIUrl":"https://doi.org/10.1145/3639817","url":null,"abstract":"<p>Given a user, a personalized search model relies on her historical behaviors, such as issued queries and their clicked documents, to generate an interest profile and personalize search results accordingly. In interest profiling, most existing personalized search approaches use “static” document representations as the inputs, which do not change with the current search. However, because a document is usually long and contains multiple pieces of information, a static fixed-length document vector is usually insufficient to represent the important information related to the original query or the current query, which makes the profile noisy and ambiguous. To tackle this problem, we propose building dynamic and intent-oriented document representations which highlight important parts of a document rather than simply encode the entire text. Specifically, we divide each document into multiple passages, and then separately use the original query and the current query to interact with the passages. Thereafter we generate two “dynamic” document representations containing the key information around the historical and the current user intent, respectively. We then profile interest by capturing the interactions between these document representations, the historical queries, and the current query. 
Experimental results on a real-world search log dataset demonstrate that our model significantly outperforms state-of-the-art personalization methods.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"126 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139411212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}