Title: A Self-Distilled Learning to Rank Model for Ad-hoc Retrieval
Authors: S. Keshvari, Farzan Saeedi, Hadi Sadoghi Yazdi, F. Ensan
ACM Transactions on Information Systems, published 2024-07-25. DOI: https://doi.org/10.1145/3681784
Abstract: Learning to rank models are broadly applied in ad-hoc retrieval for scoring and sorting documents based on their relevance to textual queries. The generalizability of the trained model in the learning to rank approach, however, can affect retrieval performance, particularly when the data includes noise and outliers or is incorrectly collected or measured. In this paper, we introduce a Self-Distilled Learning to Rank (SDLR) framework for ad-hoc retrieval and analyze its performance over a range of retrieval datasets, including in the presence of feature noise. SDLR assigns a confidence weight to each training sample, aiming at reducing the impact of noisy and outlier data in the training process. The confidence weight is approximated from the feature distributions derived from the values observed for the features of the documents labeled for a query in a listwise training sample. SDLR includes a distillation process that facilitates passing the underlying patterns in assigning confidence weights from the teacher model to the student one. We empirically illustrate that SDLR outperforms state-of-the-art learning to rank models in ad-hoc retrieval. We thoroughly investigate SDLR's performance in different settings, including when no distillation strategy is applied, when different portions of the data are used for training the teacher and the student models, and when both the teacher and student models are trained over identical data. We show that SDLR is more effective when the training data is split between a teacher and a student model. We also show that SDLR's performance is robust when data features are noisy.

Title: AdaGIN: Adaptive Graph Interaction Network for Click-Through Rate Prediction
Authors: Lei Sang, Honghao Li, Yiwen Zhang, Yi Zhang, Yun Yang
ACM Transactions on Information Systems, published 2024-07-25. DOI: https://doi.org/10.1145/3681785
Abstract: The goal of click-through rate (CTR) prediction in recommender systems is to make effective use of input features. However, existing CTR prediction models face three main issues. First, many models use a simplistic approach to feature combination, introducing noise and reducing accuracy. Second, they do not account for the varying importance of features in different interaction orders, which affects model performance. Third, current model architectures struggle to capture interaction signals from different semantic spaces, leading to sub-optimal performance. To address these issues, we propose the Adaptive Graph Interaction Network (AdaGIN), comprising a Graph Neural Network-based Feature Interaction Module (GFIM), a Multi-semantic Feature Interaction Module (MFIM), and a Negative Feedback-based Search (NFS) algorithm. GFIM explicitly aggregates information between features and assesses their importance, while MFIM captures information from different semantic spaces. NFS uses negative feedback to optimize model complexity. Experimental results show AdaGIN outperforms existing models on large-scale public benchmark datasets.
{"title":"RevGNN: Negative Sampling Enhanced Contrastive Graph Learning for Academic Reviewer Recommendation","authors":"Weibin Liao, Yifan Zhu, Yanyan Li, Qi Zhang, Zhonghong Ou, Xuesong Li","doi":"10.1145/3679200","DOIUrl":"https://doi.org/10.1145/3679200","url":null,"abstract":"Acquiring reviewers for academic submissions is a challenging recommendation scenario. Recent graph learning-driven models have made remarkable progress in the field of recommendation, but their performance in the academic reviewer recommendation task may suffer from a significant false negative issue. This arises from the assumption that unobserved edges represent negative samples. In fact, the mechanism of anonymous review results in inadequate exposure of interactions between reviewers and submissions, leading to a higher number of unobserved interactions compared to those caused by reviewers declining to participate. Therefore, investigating how to better comprehend the negative labeling of unobserved interactions in academic reviewer recommendations is a significant challenge. This study aims to tackle the ambiguous nature of unobserved interactions in academic reviewer recommendations. Specifically, we propose an unsupervised Pseudo Neg-Label strategy to enhance graph contrastive learning (GCL) for recommending reviewers for academic submissions, which we call RevGNN. RevGNN utilizes a two-stage encoder structure that encodes both scientific knowledge and behavior using Pseudo Neg-Label to approximate review preference. Extensive experiments on three real-world datasets demonstrate that RevGNN outperforms all baselines across four metrics. Additionally, detailed further analyses confirm the effectiveness of each component in RevGNN.","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141815618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual Contrastive Learning for Cross-domain Named Entity Recognition","authors":"Jingyun Xu, Junnan Yu, Yi Cai, Tat-Seng Chua","doi":"10.1145/3678879","DOIUrl":"https://doi.org/10.1145/3678879","url":null,"abstract":"\u0000 Benefiting many information retrieval applications, named entity recognition (NER) has shown impressive progress. Recently, there has been a growing trend to decompose complex NER tasks into two subtasks (\u0000 \u0000 (e.g.,)\u0000 \u0000 entity span detection (ESD) and entity type classification (ETC), to achieve better performance. Despite the remarkable success, from the perspective of representation, existing methods do not explicitly distinguish non-entities and entities, which may lead to entity span detection errors. Meanwhile, they do not explicitly distinguish entities with different entity types, which may lead to entity type misclassification. As such, the limited representation abilities may challenge some competitive NER methods, leading to unsatisfactory performance, especially in the low-resource setting (\u0000 \u0000 (e.g.,)\u0000 \u0000 cross-domain NER). In light of these challenges, we propose to utilize contrastive learning to refine the original chaotic representations and learn the generalized representations for cross-domain NER. In particular, this paper proposes a dual contrastive learning model (Dual-CL), which respectively utilizes a token-level contrastive learning module and a sentence-level contrastive learning module to enhance ESD, ETC for cross-domain NER. Empirical results on 10 domain pairs under two different settings show that Dual-CL achieves better performances than compared baselines in terms of several standard metrics. Moreover, we conduct detailed analyses to are presented to better understand each component’s effectiveness.\u0000","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141820691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: A Knowledge Graph Embedding Model for Answering Factoid Entity Questions
Authors: Parastoo Jafarzadeh, F. Ensan, Mahdiyar Ali Akbar Alavi, Fattane Zarrinkalam
ACM Transactions on Information Systems, published 2024-07-15. DOI: https://doi.org/10.1145/3678003
Abstract: Factoid entity questions (FEQ), which seek answers in the form of a single entity from knowledge sources such as DBpedia and Wikidata, constitute a substantial portion of user queries in search engines. This paper introduces the Knowledge Graph Embedding model for Factoid Entity Question answering (KGE-FEQ). Leveraging a textual knowledge graph derived from extensive text collections, KGE-FEQ encodes textual relationships between entities. The model employs a two-step process: (1) Triple Retrieval, where relevant triples are retrieved from the textual knowledge graph based on semantic similarities to the question, and (2) Answer Selection, where a knowledge graph embedding approach is utilized for answering the question. This involves positioning the embedding for the answer entity close to the embedding of the question entity, incorporating a vector representing the question and the textual relations between entities. Extensive experiments evaluate the performance of the proposed approach, comparing KGE-FEQ to state-of-the-art baselines in factoid entity question answering and to the most advanced open-domain question answering techniques applied to FEQs. The results show that KGE-FEQ outperforms existing methods across different datasets. Ablation studies highlight the effectiveness of KGE-FEQ when both the question and the textual relations between entities are considered for answering questions.
{"title":"Knowledge-Enhanced Conversational Recommendation via Transformer-based Sequential Modelling","authors":"Jie Zou, Aixin Sun, Cheng Long, E. Kanoulas","doi":"10.1145/3677376","DOIUrl":"https://doi.org/10.1145/3677376","url":null,"abstract":"In Conversational Recommender Systems (CRSs), conversations usually involve a set of items and item-related entities or attributes, e.g., director is a related entity of a movie. These items and item-related entities are often mentioned along the development of a dialog, leading to potential sequential dependencies among them. However, most of existing CRSs neglect these potential sequential dependencies. In this paper, we first propose a Transformer-based sequential conversational recommendation method, named TSCR, to model the sequential dependencies in the conversations to improve CRS. In TSCR, we represent conversations by items and the item-related entities, and construct user sequences to discover user preferences by considering both the mentioned items and item-related entities. Based on the constructed sequences, we deploy a Cloze task to predict the recommended items along a sequence. Meanwhile, in certain domains, knowledge graphs formed by the items and their related entities are readily available, which provide various different kinds of associations among them. Given that TSCR does not benefit from such knowledge graphs, we then propose a knowledge graph enhanced version of TSCR, called TSCRKG. In specific, we leverage the knowledge graph to offline initialize our model TSCRKG, and augment the user sequence of conversations (i.e., sequence of the mentioned items and item-related entities in the conversation) with multi-hop paths in the knowledge graph. Experimental results demonstrate that our TSCR model significantly outperforms state-of-the-art baselines, and the enhanced version TSCRKG further improves recommendation performance on top of TSCR.","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141652814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Elastic Language Models","authors":"Chen Zhang, Benyou Wang, Dawei Song","doi":"10.1145/3677375","DOIUrl":"https://doi.org/10.1145/3677375","url":null,"abstract":"\u0000 Large-scale pretrained language models have achieved compelling performance in a wide range of language understanding and information retrieval tasks. While their large scales ensure capacity, they also hinder deployment. Knowledge distillation offers an opportunity to compress a large language model to a small one, in order to reach a reasonable latency-performance tradeoff. However, for scenarios where the number of requests (e.g., queries submitted to a search engine) is highly variant, the static tradeoff attained by the compressed language model might not always fit. Once a model is assigned with a static tradeoff, it could be inadequate in that the latency is too high when the number of requests is large, or the performance is too low when the number of requests is small. To this end, we propose an elastic language model (\u0000 ElasticLM\u0000 ) that elastically adjusts the tradeoff according to the request stream. The basic idea is to introduce a compute elasticity to the compressed language model, so that the tradeoff could vary on-the-fly along a scalable and controllable compute. Specifically, we impose an elastic structure to equip\u0000 ElasticLM\u0000 with compute elasticity and design an elastic optimization method to learn\u0000 ElasticLM\u0000 under compute elasticity. To serve\u0000 ElasticLM\u0000 , we apply an elastic schedule. Considering the specificity of information retrieval, we adapt\u0000 ElasticLM\u0000 to dense retrieval and reranking, and present an\u0000 ElasticDenser\u0000 and an\u0000 ElasticRanker\u0000 respectively. Offline evaluation is conducted on a language understanding benchmark GLUE, and several information retrieval tasks including Natural Question, Trivia QA and MS MARCO. The results show that\u0000 ElasticLM\u0000 along with\u0000 ElasticDenser\u0000 and\u0000 ElasticRanker\u0000 can perform correctly and competitively compared with an array of static baselines. Furthermore, an online simulation with concurrency is also carried out. The results demonstrate that\u0000 ElasticLM\u0000 can provide elastic tradeoffs with respect to varying request stream.\u0000","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":null,"pages":null},"PeriodicalIF":5.4,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141653049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: Graph Augmentation Empowered Contrastive Learning for Recommendation
Authors: Lixiang Xu, Yusheng Liu, Tong Xu, Enhong Chen, Y. Tang
ACM Transactions on Information Systems, published 2024-07-12. DOI: https://doi.org/10.1145/3677377
Abstract: The application of contrastive learning (CL) to collaborative filtering (CF) in recommender systems has achieved remarkable success. CL-based recommendation models mainly focus on creating multiple augmented views by employing different graph augmentation methods and utilizing these views for self-supervised learning. However, current CL methods for recommender systems usually struggle to fully address the problem of noisy data. To address this problem, we propose the Graph Augmentation Empowered Contrastive Learning (GAECL) for recommendation framework, which uses graph augmentation based on topological and semantic dual adaptation and global co-modeling via structural optimization to co-create contrasting views for better augmentation of the CF paradigm. Specifically, we strictly filter out unimportant topologies by reconstructing the adjacency matrix and mask unimportant attributes in nodes according to the PageRank centrality principle to generate an augmented view that filters out noisy data. Additionally, GAECL achieves global collaborative modeling through structural optimization and generates another augmented view based on the PageRank centrality principle. This helps filter noisy data while preserving the original semantics of the data for more effective data augmentation. Extensive experiments on five datasets demonstrate the superior performance of our model over various recommendation models.

Title: ReCRec: Reasoning the Causes of Implicit Feedback for Debiased Recommendation
Authors: Siyi Lin, Sheng Zhou, Jiawei Chen, Yan Feng, Qihao Shi, Chun Chen, Ying Li, Can Wang
ACM Transactions on Information Systems, published 2024-07-08. DOI: https://doi.org/10.1145/3672275
Abstract: Implicit feedback (e.g., user clicks) is widely used in building recommender systems (RS). However, the inherent and notorious exposure bias significantly affects recommendation performance. Exposure bias refers to the phenomenon that implicit feedback is influenced by user exposure and does not precisely reflect user preference. Current methods for addressing exposure bias primarily reduce confidence in unclicked data, employ exposure models, or leverage propensity scores. Regrettably, these approaches often lead to biased estimations or elevated model variance, yielding sub-optimal results. To overcome these limitations, we propose a new method, ReCRec, that Reasons the Causes behind the implicit feedback for debiased Recommendation. ReCRec identifies three scenarios behind unclicked data, i.e., unexposed, disliked, or a combination of both. A reasoning module is employed to infer the category to which each instance pertains. Consequently, the model is capable of extracting reliable positive and negative signals from unclicked data, thereby facilitating more accurate learning of user preferences. We also conduct thorough theoretical analyses to demonstrate the debiased nature and low variance of ReCRec. Extensive experiments on both semi-synthetic and real-world datasets validate its superiority over state-of-the-art methods.
{"title":"TriMLP: A Foundational MLP-like Architecture for Sequential Recommendation","authors":"Yiheng Jiang, Yuanbo Xu, Yongjian Yang, Funing Yang, Pengyang Wang, Chaozhuo Li, Fuzhen Zhuang, Hui Xiong","doi":"10.1145/3670995","DOIUrl":"https://doi.org/10.1145/3670995","url":null,"abstract":"In this work, we present TriMLP as a foundational MLP-like architecture for the sequential recommendation, simultaneously achieving computational efficiency and promising performance. First, we empirically study the incompatibility between existing purely MLP-based models and sequential recommendation, that the inherent fully-connective structure endows historical user-item interactions (referred as tokens) with unrestricted communications and overlooks the essential chronological order in sequences. Then, we propose the MLP-based Triangular Mixer to establish ordered contact among tokens and excavate the primary sequential modeling capability under the standard auto-regressive training fashion. It contains (i) a global mixing layer that drops the lower-triangle neurons in MLP to block the anti-chronological connections from future tokens and (ii) a local mixing layer that further disables specific upper-triangle neurons to split the sequence as multiple independent sessions. The mixer serially alternates these two layers to support fine-grained preferences modeling, where the global one focuses on the long-range dependency in the whole sequence, and the local one calls for the short-term patterns in sessions. Experimental results on 12 datasets of different scales from 4 benchmarks elucidate that TriMLP consistently attains favorable accuracy/efficiency trade-off over all validated datasets, where the average performance boost against several state-of-the-art baselines achieves up to 14.88%, and the maximum reduction of inference time reaches 23.73%. The intriguing properties render TriMLP a strong contender to the well-established RNN-, CNN- and Transformer-based sequential recommenders. Code is available at https://github.com/jiangyiheng1/TriMLP.","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":null,"pages":null},"PeriodicalIF":5.6,"publicationDate":"2024-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141361602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}