Pretrained Transformers for Text Ranking: BERT and Beyond

IF 3.7 · CAS Zone 2 (Computer Science) · JCR Q2, Computer Science, Artificial Intelligence
S. Verberne
{"title":"Pretrained Transformers for Text Ranking: BERT and Beyond","authors":"S. Verberne","doi":"10.1162/coli_r_00468","DOIUrl":null,"url":null,"abstract":"Text ranking takes a central place in Information Retrieval (IR), with Web search as its best-known application. More generally, text ranking models are applicable to any Natural Language Processing (NLP) task in which relevance of information plays a role, from filtering and recommendation applications to question answering and semantic similarity comparisons. Since the rise of BERT in 2019, Transformer models have become the most used and studied architectures in both NLP and IR, and they have been applied to basically any task in our research fields—including text ranking. In a fast-changing research context, it can be challenging to keep lecture materials up to date. Lecturers in NLP are grateful for Dan Jurafsky and James Martin for yearly updating the 3rd edition of their textbook, making Speech and Language Processing the most comprehensive, modern textbook for NLP. The IR field is less fortunate, still relying on older textbooks, extended with a collection of recent materials that address neural models. The textbook Pretrained Transformers for Text Ranking: BERT and Beyond by Jimmy Lin, Rodrigo Nogueira, and Andrew Yates is a great effort to collect the recent developments in the use of Transformers for text ranking. The introduction of the book is well-scoped with clear guidance for the reader about topics that are out of scope (such as user aspects). This is followed by an excellent history section, stating for example:","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":null,"pages":null},"PeriodicalIF":3.7000,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Linguistics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/coli_r_00468","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 3

Abstract

Text ranking takes a central place in Information Retrieval (IR), with Web search as its best-known application. More generally, text ranking models are applicable to any Natural Language Processing (NLP) task in which relevance of information plays a role, from filtering and recommendation applications to question answering and semantic similarity comparisons. Since the rise of BERT in 2019, Transformer models have become the most used and studied architectures in both NLP and IR, and they have been applied to basically any task in our research fields, including text ranking. In a fast-changing research context, it can be challenging to keep lecture materials up to date. Lecturers in NLP are grateful to Dan Jurafsky and James Martin for yearly updating the 3rd edition of their textbook, making Speech and Language Processing the most comprehensive, modern textbook for NLP. The IR field is less fortunate, still relying on older textbooks, extended with a collection of recent materials that address neural models. The textbook Pretrained Transformers for Text Ranking: BERT and Beyond by Jimmy Lin, Rodrigo Nogueira, and Andrew Yates is a great effort to collect the recent developments in the use of Transformers for text ranking. The introduction of the book is well scoped, with clear guidance for the reader about topics that are out of scope (such as user aspects). This is followed by an excellent history section, stating for example:
Source journal: Computational Linguistics (Engineering & Technology - Computer Science: Interdisciplinary Applications)
CiteScore: 15.80
Self-citation rate: 0.00%
Articles per year: 45
Review time: >12 weeks
Journal description: Computational Linguistics, the longest-running publication dedicated solely to the computational and mathematical aspects of language and the design of natural language processing systems, provides university and industry linguists, computational linguists, AI and machine learning researchers, cognitive scientists, speech specialists, and philosophers with the latest insights into the computational aspects of language research.