{"title":"CADRL: Category-aware Dual-agent Reinforcement Learning for Explainable Recommendations over Knowledge Graphs","authors":"Shangfei Zheng, Hongzhi Yin, Tong Chen, Xiangjie Kong, Jian Hou, Pengpeng Zhao","doi":"arxiv-2408.03166","DOIUrl":"https://doi.org/arxiv-2408.03166","url":null,"abstract":"Knowledge graphs (KGs) have been widely adopted to mitigate data sparsity and\u0000address cold-start issues in recommender systems. While existing KGs-based\u0000recommendation methods can predict user preferences and demands, they fall\u0000short in generating explicit recommendation paths and lack explainability. As a\u0000step beyond the above methods, recent advancements utilize reinforcement\u0000learning (RL) to find suitable items for a given user via explainable\u0000recommendation paths. However, the performance of these solutions is still\u0000limited by the following two points. (1) Lack of ability to capture contextual\u0000dependencies from neighboring information. (2) The excessive reliance on short\u0000recommendation paths due to efficiency concerns. To surmount these challenges,\u0000we propose a category-aware dual-agent reinforcement learning (CADRL) model for\u0000explainable recommendations over KGs. Specifically, our model comprises two\u0000components: (1) a category-aware gated graph neural network that jointly\u0000captures context-aware item representations from neighboring entities and\u0000categories, and (2) a dual-agent RL framework where two agents efficiently\u0000traverse long paths to search for suitable items. 
Finally, experimental results\u0000show that CADRL outperforms state-of-the-art models in terms of both\u0000effectiveness and efficiency on large-scale datasets.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141969715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling User Intent Beyond Trigger: Incorporating Uncertainty for Trigger-Induced Recommendation","authors":"Jianxing Ma, Zhibo Xiao, Luwei Yang, Hansheng Xue, Xuanzhou Liu, Wen Jiang, Wei Ning, Guannan Zhang","doi":"arxiv-2408.03091","DOIUrl":"https://doi.org/arxiv-2408.03091","url":null,"abstract":"To cater to users' desire for an immersive browsing experience, numerous\u0000e-commerce platforms provide various recommendation scenarios, with a focus on\u0000Trigger-Induced Recommendation (TIR) tasks. However, the majority of current\u0000TIR methods heavily rely on the trigger item to understand user intent, lacking\u0000a higher-level exploration and exploitation of user intent (e.g., popular items\u0000and complementary items), which may result in an overly convergent\u0000understanding of users' short-term intent and can be detrimental to users'\u0000long-term purchasing experiences. Moreover, users' short-term intent shows\u0000uncertainty and is affected by various factors such as browsing context and\u0000historical behaviors, which poses challenges to user intent modeling. To\u0000address these challenges, we propose a novel model called Deep Uncertainty\u0000Intent Network (DUIN), comprising three essential modules: i) Explicit Intent\u0000Exploit Module extracting explicit user intent using the contrastive learning\u0000paradigm; ii) Latent Intent Explore Module exploring latent user intent by\u0000leveraging the multi-view relationships between items; iii) Intent Uncertainty\u0000Measurement Module offering a distributional estimation and capturing the\u0000uncertainty associated with user intent. Experiments on three real-world\u0000datasets demonstrate the superior performance of DUIN compared to existing\u0000baselines. 
Notably, DUIN has been deployed across all TIR scenarios in our\u0000e-commerce platform, with online A/B testing results conclusively validating\u0000its superiority.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141946968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Real-Time Adaptive Multi-Stream GPU System for Online Approximate Nearest Neighborhood Search","authors":"Yiping Sun, Yang Shi, Jiaolong Du","doi":"arxiv-2408.02937","DOIUrl":"https://doi.org/arxiv-2408.02937","url":null,"abstract":"In recent years, Approximate Nearest Neighbor Search (ANNS) has played a\u0000pivotal role in modern search and recommendation systems, especially in\u0000emerging LLM applications like Retrieval-Augmented Generation. There is a\u0000growing exploration into harnessing the parallel computing capabilities of GPUs\u0000to meet the substantial demands of ANNS. However, existing systems primarily\u0000focus on offline scenarios, overlooking the distinct requirements of online\u0000applications that necessitate real-time insertion of new vectors. This\u0000limitation renders such systems inefficient for real-world scenarios. Moreover,\u0000previous architectures struggled to effectively support real-time insertion due\u0000to their reliance on serial execution streams. In this paper, we introduce a\u0000novel Real-Time Adaptive Multi-Stream GPU ANNS System (RTAMS-GANNS). Our\u0000architecture achieves its objectives through three key advancements: 1) We\u0000initially examined the real-time insertion mechanisms in existing GPU ANNS\u0000systems and discovered their reliance on repetitive copying and memory\u0000allocation, which significantly hinders real-time effectiveness on GPUs. As a\u0000solution, we introduce a dynamic vector insertion algorithm based on memory\u0000blocks, which includes in-place rearrangement. 2) To enable real-time vector\u0000insertion in parallel, we introduce a multi-stream parallel execution mode,\u0000which differs from existing systems that operate serially within a single\u0000stream. Our system utilizes a dynamic resource pool, allowing multiple streams\u0000to execute concurrently without additional execution blocking. 
3) Through\u0000extensive experiments and comparisons, our approach effectively handles varying\u0000QPS levels across different datasets, reducing latency by up to 40%-80%. The\u0000proposed system has also been deployed in real-world industrial search and\u0000recommendation systems, serving hundreds of millions of users daily, and has\u0000achieved good results.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141946970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning","authors":"Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Thien Huu Nguyen","doi":"arxiv-2408.03402","DOIUrl":"https://doi.org/arxiv-2408.03402","url":null,"abstract":"Large Language Models (LLMs) excel in various natural language processing\u0000tasks, but leveraging them for dense passage embedding remains challenging.\u0000This is due to their causal attention mechanism and the misalignment between\u0000their pre-training objectives and the text ranking tasks. Despite some recent\u0000efforts to address these issues, existing frameworks for LLM-based text\u0000embeddings have been limited by their support for only a limited range of LLM\u0000architectures and fine-tuning strategies, limiting their practical application\u0000and versatility. In this work, we introduce the Unified framework for Large\u0000Language Model Embedding (ULLME), a flexible, plug-and-play implementation that\u0000enables bidirectional attention across various LLMs and supports a range of\u0000fine-tuning strategies. We also propose Generation-augmented Representation\u0000Learning (GRL), a novel fine-tuning method to boost LLMs for text embedding\u0000tasks. GRL enforces consistency between representation-based and\u0000generation-based relevance scores, leveraging LLMs' powerful generative\u0000abilities for learning passage embeddings. To showcase our framework's\u0000flexibility and effectiveness, we release three pre-trained models from ULLME\u0000with different backbone architectures, ranging from 1.5B to 8B parameters, all\u0000of which demonstrate strong performance on the Massive Text Embedding\u0000Benchmark. Our framework is publicly available at:\u0000https://github.com/nlp-uoregon/ullme. 
A demo video for ULLME can also be found\u0000at https://rb.gy/ws1ile.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141946966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging the Power of LLMs: A Fine-Tuning Approach for High-Quality Aspect-Based Summarization","authors":"Ankan Mullick, Sombit Bose, Rounak Saha, Ayan Kumar Bhowmick, Aditya Vempaty, Pawan Goyal, Niloy Ganguly, Prasenjit Dey, Ravi Kokku","doi":"arxiv-2408.02584","DOIUrl":"https://doi.org/arxiv-2408.02584","url":null,"abstract":"The ever-increasing volume of digital information necessitates efficient\u0000methods for users to extract key insights from lengthy documents. Aspect-based\u0000summarization offers a targeted approach, generating summaries focused on\u0000specific aspects within a document. Despite advancements in aspect-based\u0000summarization research, there is a continuous quest for improved model\u0000performance. Given that large language models (LLMs) have demonstrated the\u0000potential to revolutionize diverse tasks within natural language processing,\u0000particularly in the problem of summarization, this paper explores the potential\u0000of fine-tuning LLMs for the aspect-based summarization task. We evaluate the\u0000impact of fine-tuning open-source foundation LLMs, including Llama2, Mistral,\u0000Gemma and Aya, on a publicly available domain-specific aspect based summary\u0000dataset. We hypothesize that this approach will enable these models to\u0000effectively identify and extract aspect-related information, leading to\u0000superior quality aspect-based summaries compared to the state-of-the-art. We\u0000establish a comprehensive evaluation framework to compare the performance of\u0000fine-tuned LLMs against competing aspect-based summarization methods and\u0000vanilla counterparts of the fine-tuned LLMs. Our work contributes to the field\u0000of aspect-based summarization by demonstrating the efficacy of fine-tuning LLMs\u0000for generating high-quality aspect-based summaries. 
Furthermore, it opens doors\u0000for further exploration of using LLMs for targeted information extraction tasks\u0000across various NLP domains.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"47 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141969711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feedback Reciprocal Graph Collaborative Filtering","authors":"Weijun Chen, Yuanchen Bei, Qijie Shen, Hao Chen, Xiao Huang, Feiran Huang","doi":"arxiv-2408.02404","DOIUrl":"https://doi.org/arxiv-2408.02404","url":null,"abstract":"Collaborative filtering on user-item interaction graphs has achieved success\u0000in the industrial recommendation. However, recommending users' truly fascinated\u0000items poses a seesaw dilemma for collaborative filtering models learned from\u0000the interaction graph. On the one hand, not all items that users interact with\u0000are equally appealing. Some items are genuinely fascinating to users, while\u0000others are unfascinated. Training graph collaborative filtering models in the\u0000absence of distinction between them can lead to the recommendation of\u0000unfascinating items to users. On the other hand, disregarding the interacted\u0000but unfascinating items during graph collaborative filtering will result in an\u0000incomplete representation of users' interaction intent, leading to a decline in\u0000the model's recommendation capabilities. To address this seesaw problem, we\u0000propose Feedback Reciprocal Graph Collaborative Filtering (FRGCF), which\u0000emphasizes the recommendation of fascinating items while attenuating the\u0000recommendation of unfascinating items. Specifically, FRGCF first partitions the\u0000entire interaction graph into the Interacted & Fascinated (I&F) graph and the\u0000Interacted & Unfascinated (I&U) graph based on the user feedback. Then, FRGCF\u0000introduces separate collaborative filtering on the I&F graph and the I&U graph\u0000with feedback-reciprocal contrastive learning and macro-level feedback\u0000modeling. 
This enables the I&F graph recommender to learn multi-grained\u0000interaction characteristics from the I&U graph without being misdirected by it.\u0000Extensive experiments on four benchmark datasets and a billion-scale industrial\u0000dataset demonstrate that FRGCF improves the performance by recommending more\u0000fascinating items and fewer unfascinating items. Besides, online A/B tests on\u0000Taobao's recommender system verify the superiority of FRGCF.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141947004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Query Understanding for Amazon Product Search","authors":"Chen Luo, Xianfeng Tang, Hanqing Lu, Yaochen Xie, Hui Liu, Zhenwei Dai, Limeng Cui, Ashutosh Joshi, Sreyashi Nag, Yang Li, Zhen Li, Rahul Goutam, Jiliang Tang, Haiyang Zhang, Qi He","doi":"arxiv-2408.02215","DOIUrl":"https://doi.org/arxiv-2408.02215","url":null,"abstract":"Online shopping platforms, such as Amazon, offer services to billions of\u0000people worldwide. Unlike web search or other search engines, product search\u0000engines have their unique characteristics, primarily featuring short queries\u0000which are mostly a combination of product attributes and structured product\u0000search space. The uniqueness of product search underscores the crucial\u0000importance of the query understanding component. However, there are limited\u0000studies focusing on exploring this impact within real-world product search\u0000engines. In this work, we aim to bridge this gap by conducting a comprehensive\u0000study and sharing our year-long journey investigating how the query\u0000understanding service impacts Amazon Product Search. Firstly, we explore how\u0000query understanding-based ranking features influence the ranking process. Next,\u0000we delve into how the query understanding system contributes to understanding\u0000the performance of a ranking model. 
Building on the insights gained from our\u0000study on the evaluation of the query understanding-based ranking model, we\u0000propose a query understanding-based multi-task learning framework for ranking.\u0000We present our studies and investigations using the real-world system on Amazon\u0000Search.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141946973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embedding Compression in Recommender Systems: A Survey","authors":"Shiwei Li, Huifeng Guo, Xing Tang, Ruiming Tang, Lu Hou, Ruixuan Li, Rui Zhang","doi":"arxiv-2408.02304","DOIUrl":"https://doi.org/arxiv-2408.02304","url":null,"abstract":"To alleviate the problem of information explosion, recommender systems are\u0000widely deployed to provide personalized information filtering services.\u0000Usually, embedding tables are employed in recommender systems to transform\u0000high-dimensional sparse one-hot vectors into dense real-valued embeddings.\u0000However, the embedding tables are huge and account for most of the parameters\u0000in industrial-scale recommender systems. In order to reduce memory costs and\u0000improve efficiency, various approaches are proposed to compress the embedding\u0000tables. In this survey, we provide a comprehensive review of embedding\u0000compression approaches in recommender systems. We first introduce deep learning\u0000recommendation models and the basic concept of embedding compression in\u0000recommender systems. Subsequently, we systematically organize existing\u0000approaches into three categories, namely low-precision, mixed-dimension, and\u0000weight-sharing, respectively. Lastly, we summarize the survey with some general\u0000suggestions and provide future prospects for this field.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141946976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RECE: Reduced Cross-Entropy Loss for Large-Catalogue Sequential Recommenders","authors":"Danil Gusak, Gleb Mezentsev, Ivan Oseledets, Evgeny Frolov","doi":"arxiv-2408.02354","DOIUrl":"https://doi.org/arxiv-2408.02354","url":null,"abstract":"Scalability is a major challenge in modern recommender systems. In sequential\u0000recommendations, full Cross-Entropy (CE) loss achieves state-of-the-art\u0000recommendation quality but consumes excessive GPU memory with large item\u0000catalogs, limiting its practicality. Using a GPU-efficient locality-sensitive\u0000hashing-like algorithm for approximating large tensor of logits, this paper\u0000introduces a novel RECE (REduced Cross-Entropy) loss. RECE significantly\u0000reduces memory consumption while allowing one to enjoy the state-of-the-art\u0000performance of full CE loss. Experimental results on various datasets show that\u0000RECE cuts training peak memory usage by up to 12 times compared to existing\u0000methods while retaining or exceeding performance metrics of CE loss. The\u0000approach also opens up new possibilities for large-scale applications in other\u0000domains.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"57 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141946972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Entity Retrieval for Answering Entity-Centric Questions","authors":"Hassan S. Shavarani, Anoop Sarkar","doi":"arxiv-2408.02795","DOIUrl":"https://doi.org/arxiv-2408.02795","url":null,"abstract":"The similarity between the question and indexed documents is a crucial factor\u0000in document retrieval for retrieval-augmented question answering. Although this\u0000is typically the only method for obtaining the relevant documents, it is not\u0000the sole approach when dealing with entity-centric questions. In this study, we\u0000propose Entity Retrieval, a novel retrieval method which rather than relying on\u0000question-document similarity, depends on the salient entities within the\u0000question to identify the retrieval documents. We conduct an in-depth analysis\u0000of the performance of both dense and sparse retrieval methods in comparison to\u0000Entity Retrieval. Our findings reveal that our method not only leads to more\u0000accurate answers to entity-centric questions but also operates more\u0000efficiently.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"174 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141946971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}