Foundations and Trends in Information Retrieval: Latest Articles

From Foundations to GPT in Text Classification: A Comprehensive Survey on Current Approaches and Future Trends

IF 10.4, Zone 2 (Computer Science)

Foundations and Trends in Information Retrieval Pub Date : 2025-04-16 DOI: 10.1561/1500000107

Marco Siino, Ilenia Tinnirello, Marco La Cascia

Abstract: Text classification stands as a cornerstone within the realm of Natural Language Processing (NLP), particularly when viewed through computer science and engineering. The past decade has seen deep learning revolutionize text classification, propelling advancements in text retrieval, categorization, information extraction, and summarization. The scholarly literature includes datasets, models, and evaluation criteria, with English being the predominant language of focus, despite studies involving Arabic, Chinese, Hindi, and others. The efficacy of text classification models relies heavily on their ability to capture intricate textual relationships and non-linear correlations, necessitating a comprehensive examination of the entire text classification pipeline.

In the NLP domain, a plethora of text representation techniques and model architectures have emerged, with Large Language Models (LLMs) and Generative Pre-trained Transformers (GPTs) at the forefront. These models are adept at transforming extensive textual data into meaningful vector representations encapsulating semantic information. The multidisciplinary nature of text classification, encompassing data mining, linguistics, and information retrieval, highlights the importance of collaborative research to advance the field. This work integrates traditional and contemporary text mining methodologies, fostering a holistic understanding of text classification.

This monograph provides an in-depth exploration of the text classification pipeline, with a particular emphasis on evaluating the impact of each component on the overall performance of text classification models. The pipeline includes state-of-the-art datasets, text preprocessing techniques, text representation methods, classification models, evaluation metrics, and future trends. Each section examines these stages, presenting technical innovations and recent findings. The work assesses various classification strategies, offering comparative analyses, examples and case studies. These contributions extend beyond a typical survey, providing a detailed and insightful exploration of the field.

Citations: 0

Search as Learning

IF 10.4, Zone 2 (Computer Science)

Foundations and Trends in Information Retrieval Pub Date : 2025-03-10 DOI: 10.1561/1500000084

Kelsey Urgo, Jaime Arguello

Abstract: Search systems are often designed to support simple look-up tasks, such as fact-finding and navigation tasks. However, people increasingly use search engines to complete tasks that require deeper learning. In recent years, the search as learning (SAL) research community has argued that search systems should also be designed to support information-seeking tasks that involve complex learning as an important outcome. This monograph aims to provide a comprehensive review of prior research in search as learning and related areas. Searching to learn can be characterized by specific learning objectives, strategies, and context. Therefore, we begin by reviewing research in education that has aimed at characterizing learning objectives, strategies, and context. Then, we review methods used in prior studies to measure learning during a search session. Here, we discuss two important recommendations for future work: (1) measuring learning retention and (2) measuring a learner's ability to transfer their new knowledge to a novel scenario. Following this, we discuss studies that have focused on understanding factors that influence learning during search and search behaviors that are predictive of learning. Next, we survey tools that have been developed to support learning during search. Searching for the purpose of learning is often a solitary activity. Research in self-regulated learning (SRL) aims to understand how people monitor and control their own learning. Therefore, we review existing models of SRL, methods to measure engagement with specific SRL processes, and tools to support effective SRL. We conclude by discussing potential areas for future research.

Citations: 0

Understanding and Mitigating Gender Bias in Information Retrieval Systems

IF 10.4, Zone 2 (Computer Science)

Foundations and Trends in Information Retrieval Pub Date : 2025-02-09 DOI: 10.1561/1500000103

Shirin Seyedsalehi, Amin Bigdeli, Negar Arabzadeh, Batool AlMousawi, Zack Marshall, Morteza Zihayat, Ebrahim Bagheri

Abstract: Gender bias is a pervasive issue that continues to influence various aspects of society, including the outcomes of information retrieval (IR) systems. As these systems become increasingly integral to accessing and navigating the vast amounts of information available today, the need to understand and mitigate gender bias within them is paramount. This monograph provides a comprehensive examination of the origins, manifestations, and consequences of gender bias in IR systems, as well as the current methodologies employed to address these biases. Theoretical frameworks surrounding gender and its representation in artificial intelligence (AI) systems are explored, particularly focusing on how traditional gender binaries are perpetuated and reinforced through data and algorithmic processes. Metrics and methodologies used to identify and measure gender bias within IR systems are then analyzed, offering a detailed evaluation of existing approaches and their limitations. Subsequent sections address the sources of gender bias, including biased input queries, retrieval methods, and gold standard datasets. Various data-driven and method-level debiasing strategies are presented, including techniques for debiasing neural embeddings and algorithmic approaches aimed at reducing bias in IR system outputs. The monograph concludes with a discussion of the challenges and limitations faced by current debiasing efforts and provides insights into future research directions that could lead to more equitable and inclusive IR systems.

This monograph serves as a valuable resource for researchers, practitioners, and students in the fields of information retrieval, artificial intelligence, and data science, providing the knowledge and tools needed to address gender bias and contribute to the development of fair and unbiased information systems.

Citations: 0

Mathematical Information Retrieval: Search and Question Answering

IF 10.4, Zone 2 (Computer Science)

Foundations and Trends in Information Retrieval Pub Date : 2025-01-27 DOI: 10.1561/1500000095

Richard Zanibbi, Behrooz Mansouri, Anurag Agarwal

Citations: 0

Information Discovery in E-commerce

IF 10.4, Zone 2 (Computer Science)

Foundations and Trends in Information Retrieval Pub Date : 2024-12-30 DOI: 10.1561/1500000097

Zhaochun Ren, Xiangnan He, Dawei Yin, Maarten de Rijke

Abstract: Electronic commerce, or e-commerce, is the buying and selling of goods and services, or the transmitting of funds or data online. E-commerce platforms come in many kinds, with global players such as Amazon, Airbnb, Alibaba, Booking.com, eBay, and JD.com and platforms targeting specific geographic regions such as Bol.com and Flipkart.com. Information retrieval has a natural role to play in e-commerce, especially in connecting people to goods and services. Information discovery in e-commerce concerns different types of search (e.g., exploratory search vs. lookup tasks), recommender systems, and natural language processing in e-commerce portals. The rise in popularity of e-commerce sites has made research on information discovery in e-commerce an increasingly active research area. This is witnessed by an increase in publications and dedicated workshops in this space. Methods for information discovery in e-commerce largely focus on improving the effectiveness of e-commerce search and recommender systems, on enriching and using knowledge graphs to support e-commerce, and on developing innovative question answering and bot-based solutions that help to connect people to goods and services. In this survey, an overview is given of the fundamental infrastructure, algorithms, and technical solutions for information discovery in e-commerce. The topics covered include user behavior and profiling, search, recommendation, and language technology in e-commerce.

Citations: 0

Fairness in Search Systems

IF 10.4, Zone 2 (Computer Science)

Foundations and Trends in Information Retrieval Pub Date : 2024-12-23 DOI: 10.1561/1500000101

Yi Fang, Ashudeep Singh, Zhiqiang Tao

Citations: 0

User Simulation for Evaluating Information Access Systems

IF 10.4, Zone 2 (Computer Science)

Foundations and Trends in Information Retrieval Pub Date : 2024-06-12 DOI: 10.1561/1500000098

Krisztian Balog, ChengXiang Zhai

Abstract: Information access systems, such as search engines, recommender systems, and conversational assistants, have become integral to our daily lives as they help us satisfy our information needs. However, evaluating the effectiveness of these systems presents a long-standing and complex scientific challenge. This challenge is rooted in the difficulty of assessing a system’s overall effectiveness in assisting users to complete tasks through interactive support, and further exacerbated by the substantial variation in user behaviour and preferences. To address this challenge, user simulation emerges as a promising solution.

This monograph focuses on providing a thorough understanding of user simulation techniques designed specifically for evaluation purposes. We begin with a background of information access system evaluation and explore the diverse applications of user simulation. Subsequently, we systematically review the major research progress in user simulation, covering both general frameworks for designing user simulators, utilizing user simulation for evaluation, and specific models and algorithms for simulating user interactions with search engines, recommender systems, and conversational assistants. Realizing that user simulation is an interdisciplinary research topic, whenever possible, we attempt to establish connections with related fields, including machine learning, dialogue systems, user modeling, and economics. We end the monograph with a broad discussion of important future research directions, many of which extend beyond the evaluation of information access systems and are expected to have broader impact on how to evaluate interactive intelligent systems in general.

Citations: 0

Multi-hop Question Answering

IF 10.4, Zone 2 (Computer Science)

Foundations and Trends in Information Retrieval Pub Date : 2024-06-12 DOI: 10.1561/1500000102

Vaibhav Mavi, Anubhav Jangra, Adam Jatowt

Abstract: The task of Question Answering (QA) has attracted significant research interest for a long time. Its relevance to language understanding and knowledge retrieval tasks, along with the simple setting, makes the task of QA crucial for strong AI systems. Recent success on simple QA tasks has shifted the focus to more complex settings. Among these, Multi-Hop QA (MHQA) is one of the most researched tasks over recent years. In broad terms, MHQA is the task of answering natural language questions that involve extracting and combining multiple pieces of information and doing multiple steps of reasoning. An example of a multi-hop question would be “The Argentine PGA Championship record holder has won how many tournaments worldwide?”. Answering the question would need two pieces of information: “Who is the record holder for Argentine PGA Championship tournaments?” and “How many tournaments did [Answer of Sub Q1] win?”. The ability to answer multi-hop questions and perform multi-step reasoning can significantly improve the utility of NLP systems. Consequently, the field has seen a surge of high-quality datasets, models and evaluation strategies. The notion of ‘multiple hops’ is somewhat abstract which results in a large variety of tasks that require multi-hop reasoning. This leads to different datasets and models that differ significantly from each other and make the field challenging to generalize and survey. We aim to provide a general and formal definition of the MHQA task, and organize and summarize existing MHQA frameworks. We also outline some best practices for building MHQA datasets. This monograph provides a systematic and thorough introduction as well as the structuring of the existing attempts to this highly interesting, yet quite challenging task.
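
As a rough illustration of the two-step decomposition in the multi-hop example (first find the record holder, then count that player's wins), the sketch below chains two single-hop lookups over a toy fact table. It is a minimal sketch under stated assumptions, not the monograph's method: the names `TOY_FACTS`, `answer_hop`, and `answer_multi_hop` are hypothetical, and the stored values are placeholders rather than real golf statistics.

```python
# Toy two-hop lookup. The facts below are placeholders, not real data.
TOY_FACTS = {
    ("Argentine PGA Championship", "record_holder"): "Player X",
    ("Player X", "tournaments_won"): "N",
}


def answer_hop(entity, relation, facts):
    """Single-hop lookup: return the value stored for (entity, relation)."""
    return facts[(entity, relation)]


def answer_multi_hop(start_entity, relations, facts):
    """Chain single-hop lookups, feeding each answer into the next hop."""
    current = start_entity
    trace = []  # records (relation, answer) for every hop taken
    for relation in relations:
        current = answer_hop(current, relation, facts)
        trace.append((relation, current))
    return current, trace


answer, trace = answer_multi_hop(
    "Argentine PGA Championship",
    ["record_holder", "tournaments_won"],
    TOY_FACTS,
)
print(answer)
print(trace)
```

The intermediate answer ("Player X" here) plays the role of "[Answer of Sub Q1]" in the abstract: the second sub-question cannot be posed until the first has been resolved.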

Citations: 0

Conversational Information Seeking

IF 10.4, Zone 2 (Computer Science)

Foundations and Trends in Information Retrieval Pub Date : 2023-08-02 DOI: 10.1561/1500000081

Hamed Zamani, Johanne R. Trippas, Jeff Dalton, Filip Radlinski

Citations: 49

Perspectives of Neurodiverse Participants in Interactive Information Retrieval

IF 10.4, Zone 2 (Computer Science)

Foundations and Trends in Information Retrieval Pub Date : 2023-07-26 DOI: 10.1561/1500000086

Laurianne Sitbon, Gerd Berget, Margot Brereton

Citations: 0