The Probabilistic Relevance Framework: BM25 and Beyond

Impact factor: 8.3 | Zone 2, Computer Science | JCR Q1, Computer Science, Information Systems
S. Robertson, H. Zaragoza
Published in Foundations and Trends in Information Retrieval, pp. 333-389, 2009-04-01 (journal article)
DOI: 10.1561/1500000019 (https://doi.org/10.1561/1500000019)
Citations: 2328

Abstract

The Probabilistic Relevance Framework (PRF) is a formal framework for document retrieval, grounded in work done in the 1970s and 1980s, which led to the development of one of the most successful text-retrieval algorithms, BM25. In recent years, research in the PRF has yielded new retrieval models capable of taking into account document metadata (especially structure and link-graph information). This in turn has led to one of the most successful Web-search and corporate-search algorithms, BM25F. This work presents the PRF from a conceptual point of view, describing the probabilistic modelling assumptions behind the framework and the different ranking algorithms that result from its application: the binary independence model, relevance feedback models, BM25 and BM25F. It also discusses the relation between the PRF and other statistical models for IR, and covers some related topics, such as the use of non-textual features and parameter optimisation for models with free parameters.
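For orientation, the BM25 ranking function named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes whitespace-tokenized documents, the widely used form of the term weight (a saturating term-frequency component with document-length normalisation, multiplied by an inverse-document-frequency weight estimated without relevance information), and the commonly quoted defaults `k1=1.2` and `b=0.75` for the two free parameters. The function name `bm25_scores` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    k1 controls term-frequency saturation and b controls document-length
    normalisation; both defaults are assumptions, not tuned values.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalisation: longer-than-average documents are penalised.
        length_norm = (1 - b) + b * len(d) / avg_len
        score = 0.0
        for term in set(query):
            if tf[term] == 0:
                continue
            # IDF-style weight estimated without relevance information.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturating term-frequency component.
            score += idf * tf[term] / (k1 * length_norm + tf[term])
        scores.append(score)
    return scores


# Hypothetical toy collection, for illustration only.
docs = [
    "probabilistic relevance framework for retrieval".split(),
    "link graph structure of web documents".split(),
    "cooking pasta at home".split(),
    "gardening tips for spring".split(),
]
scores = bm25_scores("probabilistic retrieval".split(), docs)
best = max(range(len(docs)), key=lambda i: scores[i])
```

With this weight, a term that occurs in more than half of the collection receives a negative contribution; practical systems often clamp or shift the weight to avoid that, which this sketch does not do.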
Source journal: Foundations and Trends in Information Retrieval (Computer Science, Information Systems)
CiteScore: 39.10
Self-citation rate: 0.00%
Articles published: 3
About the journal: Research output has grown rapidly across all domains over the past decade, and keeping up with the literature has become time-consuming. Electronic publishing gives instant access to more articles than ever, but identifying the ones essential to understanding a topic remains difficult. Foundations and Trends® in Information Retrieval (FnTIR) addresses this by publishing high-quality survey and tutorial monographs in the field. Each issue features a 50-100 page monograph written by research leaders, covering tutorials, research retrospectives, and state-of-the-art surveys within the scope of the journal.