{"title":"Causal Discovery in Recommender Systems: Example and Discussion","authors":"Emanuele Cavenaghi, Fabio Stella, Markus Zanker","doi":"arxiv-2409.10271","DOIUrl":"https://doi.org/arxiv-2409.10271","url":null,"abstract":"Causality is receiving increasing attention from the artificial intelligence and machine learning communities. This paper gives an example of modelling a recommender system problem using causal graphs. Specifically, we approached the causal discovery task to learn a causal graph by combining observational data from an open-source dataset with prior knowledge. The resulting causal graph shows that only a few variables effectively influence the analysed feedback signals. This contrasts with the recent trend in the machine learning community to include more and more variables in massive models, such as neural networks.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"213 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142255420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AlpaPICO: Extraction of PICO Frames from Clinical Trial Documents Using LLMs","authors":"Madhusudan Ghosh, Shrimon Mukherjee, Asmit Ganguly, Partha Basuchowdhuri, Sudip Kumar Naskar, Debasis Ganguly","doi":"arxiv-2409.09704","DOIUrl":"https://doi.org/arxiv-2409.09704","url":null,"abstract":"In recent years, there has been a surge in the publication of clinical trial reports, making it challenging to conduct systematic reviews. Automatically extracting Population, Intervention, Comparator, and Outcome (PICO) from clinical trial studies can alleviate the traditionally time-consuming process of manually scrutinizing systematic reviews. Existing approaches to PICO frame extraction involve supervised methods that rely on manually annotated data points in the form of BIO label tagging. Recent approaches, such as In-Context Learning (ICL), which have been shown to be effective for a number of downstream NLP tasks, require the use of labeled examples. In this work, we adopt an ICL strategy that employs the pretrained knowledge of Large Language Models (LLMs), gathered during the pretraining phase, to automatically extract PICO-related terminologies from clinical trial documents in an unsupervised setup, bypassing the need for a large number of annotated data instances. Additionally, to showcase the effectiveness of LLMs in an oracle scenario where a large number of annotated samples is available, we adopt an instruction-tuning strategy, employing Low-Rank Adaptation (LoRA) to train a large model in a low-resource environment for the PICO frame extraction task. Our empirical results show that our proposed ICL-based framework produces comparable results on all versions of the EBM-NLP dataset, and the proposed instruction-tuned version of our framework produces state-of-the-art results on all the different EBM-NLP datasets. Our project is available at https://github.com/shrimonmuke0202/AlpaPICO.git.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142255105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measuring Recency Bias In Sequential Recommendation Systems","authors":"Jeonglyul Oh, Sungzoon Cho","doi":"arxiv-2409.09722","DOIUrl":"https://doi.org/arxiv-2409.09722","url":null,"abstract":"Recency bias in a sequential recommendation system refers to the overly high emphasis placed on recent items within a user session. This bias can diminish the serendipity of recommendations and hinder the system's ability to capture users' long-term interests, leading to user disengagement. We propose a simple yet effective novel metric specifically designed to quantify recency bias. Our findings also demonstrate that high recency bias, as measured by our proposed metric, adversely impacts recommendation performance, and that mitigating it improves recommendation performance across all models evaluated in our experiments, highlighting the importance of measuring recency bias.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142255101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Language Models and Retrieval Augmented Generation for Automated Structured Data Extraction from Diagnostic Reports","authors":"Mohamed Sobhi Jabal, Pranav Warman, Jikai Zhang, Kartikeye Gupta, Ayush Jain, Maciej Mazurowski, Walter Wiggins, Kirti Magudia, Evan Calabrese","doi":"arxiv-2409.10576","DOIUrl":"https://doi.org/arxiv-2409.10576","url":null,"abstract":"Purpose: To develop and evaluate an automated system for extracting structured clinical information from unstructured radiology and pathology reports using open-weights large language models (LMs) and retrieval augmented generation (RAG), and to assess the effects of model configuration variables on extraction performance. Methods and Materials: The study utilized two datasets: 7,294 radiology reports annotated for Brain Tumor Reporting and Data System (BT-RADS) scores and 2,154 pathology reports annotated for isocitrate dehydrogenase (IDH) mutation status. An automated pipeline was developed to benchmark the performance of various LMs and RAG configurations. The impact of model size, quantization, prompting strategies, output formatting, and inference parameters was systematically evaluated. Results: The best-performing models achieved over 98% accuracy in extracting BT-RADS scores from radiology reports and over 90% for IDH mutation status extraction from pathology reports; the top model was a medically fine-tuned Llama3. Larger, newer, and domain fine-tuned models consistently outperformed older and smaller models. Model quantization had minimal impact on performance. Few-shot prompting significantly improved accuracy. RAG improved performance for complex pathology reports but not for shorter radiology reports. Conclusions: Open LMs demonstrate significant potential for automated extraction of structured clinical data from unstructured clinical reports in local, privacy-preserving applications. Careful model selection, prompt engineering, and semi-automated optimization using annotated data are critical for optimal performance. These approaches could be reliable enough for practical use in research workflows, highlighting the potential for human-machine collaboration in healthcare data extraction.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142255416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CROSS-JEM: Accurate and Efficient Cross-encoders for Short-text Ranking Tasks","authors":"Bhawna Paliwal, Deepak Saini, Mudit Dhawan, Siddarth Asokan, Nagarajan Natarajan, Surbhi Aggarwal, Pankaj Malhotra, Jian Jiao, Manik Varma","doi":"arxiv-2409.09795","DOIUrl":"https://doi.org/arxiv-2409.09795","url":null,"abstract":"Ranking a set of items based on their relevance to a given query is a core problem in search and recommendation. Transformer-based ranking models are the state-of-the-art approaches for such tasks, but they score each query-item pair independently, ignoring the joint context of other relevant items. This leads to sub-optimal ranking accuracy and high computational costs. In response, we propose Cross-encoders with Joint Efficient Modeling (CROSS-JEM), a novel ranking approach that enables transformer-based models to jointly score multiple items for a query, maximizing parameter utilization. CROSS-JEM leverages (a) redundancies and token overlaps to jointly score multiple items, which are typically short text phrases arising in search and recommendation, and (b) a novel training objective that models ranking probabilities. CROSS-JEM achieves state-of-the-art accuracy and over 4x lower ranking latency than standard cross-encoders. Our contributions are threefold: (i) we highlight the gap between the ranking application's need to score thousands of items per query and the limited capabilities of current cross-encoders; (ii) we introduce CROSS-JEM for joint efficient scoring of multiple items per query; and (iii) we demonstrate state-of-the-art accuracy on standard public datasets and a proprietary dataset. CROSS-JEM opens up new directions for designing tailored early-attention-based ranking models that incorporate strict production constraints such as item multiplicity and latency.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142255100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unleash LLMs Potential for Recommendation by Coordinating Twin-Tower Dynamic Semantic Token Generator","authors":"Jun Yin, Zhengxin Zeng, Mingzheng Li, Hao Yan, Chaozhuo Li, Weihao Han, Jianjin Zhang, Ruochen Liu, Allen Sun, Denvy Deng, Feng Sun, Qi Zhang, Shirui Pan, Senzhang Wang","doi":"arxiv-2409.09253","DOIUrl":"https://doi.org/arxiv-2409.09253","url":null,"abstract":"Owing to their unprecedented capabilities in semantic understanding and logical reasoning, pre-trained large language models (LLMs) have shown great potential for developing next-generation recommender systems (RSs). However, the static index paradigm adopted by current methods greatly restricts the utilization of LLM capacity for recommendation, leading not only to insufficient alignment between semantic and collaborative knowledge, but also to the neglect of high-order user-item interaction patterns. In this paper, we propose the Twin-Tower Dynamic Semantic Recommender (TTDS), the first generative RS to adopt a dynamic semantic index paradigm, aiming to resolve both problems simultaneously. More specifically, we devise a dynamic knowledge fusion framework that integrates a twin-tower semantic token generator into an LLM-based recommender, hierarchically allocating meaningful semantic indices for items and users, and accordingly predicting the semantic index of the target item. Furthermore, a dual-modality variational auto-encoder is proposed to facilitate multi-grained alignment between semantic and collaborative knowledge. Finally, a series of novel tuning tasks specially customized for capturing high-order user-item interaction patterns is proposed to take advantage of users' historical behavior. Extensive experiments across three public datasets demonstrate the superiority of the proposed methodology for developing LLM-based generative RSs. The proposed TTDS recommender achieves an average improvement of 19.41% in Hit-Rate and 20.84% in NDCG compared with the leading baseline methods.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142255103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LLM-based Weak Supervision Framework for Query Intent Classification in Video Search","authors":"Farnoosh Javadi, Phanideep Gampa, Alyssa Woo, Xingxing Geng, Hang Zhang, Jose Sepulveda, Belhassen Bayar, Fei Wang","doi":"arxiv-2409.08931","DOIUrl":"https://doi.org/arxiv-2409.08931","url":null,"abstract":"Streaming services have reshaped how we discover and engage with digital entertainment. Despite these advancements, effectively understanding the wide spectrum of user search queries continues to pose a significant challenge. An accurate query understanding system that can handle the variety of entities representing different user intents is essential for delivering an enhanced user experience. We can build such a system by training a natural language understanding (NLU) model; however, obtaining high-quality labeled training data in this specialized domain is a substantial obstacle. Manual annotation is costly and impractical for capturing users' vast vocabulary variations. To address this, we introduce a novel approach that leverages large language models (LLMs) through weak supervision to automatically annotate a vast collection of user search queries. Using prompt engineering and a diverse set of LLM personas, we generate training data that matches human annotator expectations. By incorporating domain knowledge via Chain of Thought and In-Context Learning, our approach leverages the labeled data to train low-latency models optimized for real-time inference. Extensive evaluations demonstrated that our approach outperformed the baseline with an average relative gain of 113% in recall. Furthermore, our novel prompt engineering framework yields higher-quality LLM-generated data for weak supervision; we observed a 47.60% improvement over the baseline in the agreement rate between LLM predictions and human annotations with respect to F1 score, weighted according to the distribution of occurrences of the search queries. Our persona selection routing mechanism adds a further 3.67% increase in weighted F1 score on top of our novel prompt engineering framework.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"67 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142255109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Information Retrieval Landscapes: An Investigation of a Novel Evaluation Techniques and Comparative Document Splitting Methods","authors":"Esmaeil Narimissa (Australian Taxation Office), David Raithel (Australian Taxation Office)","doi":"arxiv-2409.08479","DOIUrl":"https://doi.org/arxiv-2409.08479","url":null,"abstract":"The performance of Retrieval-Augmented Generation (RAG) systems in information retrieval is significantly influenced by the characteristics of the documents being processed. In this study, the structured nature of textbooks, the conciseness of articles, and the narrative complexity of novels are shown to require distinct retrieval strategies. A comparative evaluation of multiple document-splitting methods reveals that the Recursive Character Splitter outperforms the Token-based Splitter in preserving contextual integrity. A novel evaluation technique is introduced, utilizing an open-source model to generate a comprehensive dataset of question-and-answer pairs, simulating realistic retrieval scenarios to enhance testing efficiency and metric reliability. The evaluation employs weighted scoring metrics, including SequenceMatcher, BLEU, METEOR, and BERTScore, to assess the system's accuracy and relevance. This approach establishes a refined standard for evaluating the precision of RAG systems, with future research focusing on optimizing chunk and overlap sizes to improve retrieval accuracy and efficiency.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"21 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contri(e)ve: Context + Retrieve for Scholarly Question Answering","authors":"Kanchan Shivashankar, Nadine Steinmetz","doi":"arxiv-2409.09010","DOIUrl":"https://doi.org/arxiv-2409.09010","url":null,"abstract":"Scholarly communication is a rapidly growing field containing a wealth of knowledge. However, due to its unstructured and document-based format, it is challenging to extract useful information through conventional document retrieval methods. Scholarly knowledge graphs solve this problem by representing the documents in a semantic network, providing hidden insights, summaries, and ease of accessibility through queries. Naturally, question answering over scholarly graphs expands this accessibility to a wider audience. But some of the knowledge in this domain is still presented as unstructured text, thus requiring a hybrid solution for question answering systems. In this paper, we present a two-step solution using the open-source Large Language Model (LLM) Llama3.1 for the Scholarly-QALD dataset. First, we extract the context pertaining to the question from different structured and unstructured data sources: the DBLP and SemOpenAlex knowledge graphs, and Wikipedia text. Second, we implement prompt engineering to improve the information retrieval performance of the LLM. Our approach achieved an F1 score of 40%; we also observed some anomalous responses from the LLM, which are discussed in the final part of the paper.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"47 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ATFLRec: A Multimodal Recommender System with Audio-Text Fusion and Low-Rank Adaptation via Instruction-Tuned Large Language Model","authors":"Zezheng Qin","doi":"arxiv-2409.08543","DOIUrl":"https://doi.org/arxiv-2409.08543","url":null,"abstract":"Recommender Systems (RS) play a pivotal role in boosting user satisfaction by providing personalized product suggestions in domains such as e-commerce and entertainment. This study examines the integration of multimodal data (text and audio) into large language models (LLMs) with the aim of enhancing recommendation performance. Traditional text and audio recommenders encounter limitations such as the cold-start problem, and recent advancements in LLMs, while promising, are computationally expensive. To address these issues, Low-Rank Adaptation (LoRA) is introduced, which enhances efficiency without compromising performance. The ATFLRec framework is proposed to integrate audio and text modalities into a multimodal recommendation system, utilizing various LoRA configurations and modality fusion techniques. Results indicate that ATFLRec outperforms baseline models, including traditional and graph neural network-based approaches, achieving higher AUC scores. Furthermore, separate fine-tuning of audio and text data with distinct LoRA modules yields optimal performance, with different pooling methods and Mel filter bank numbers significantly impacting performance. This research offers valuable insights into optimizing multimodal recommender systems and advancing the integration of diverse data modalities in LLMs.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"43 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142255112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}