Efficient Neural Ranking using Forward Indexes and Lightweight Encoders

Impact factor 9.1 | Zone 2 (Computer Science) | JCR Q1 (Computer Science, Information Systems)

ACM Transactions on Information Systems Pub Date : 2023-11-08 DOI:10.1145/3631939

Jurek Leonhardt, Henrik Müller, Koustav Rudra, Megha Khosla, Abhijit Anand, Avishek Anand


Citations: 1

Abstract


Efficient Neural Ranking using Forward Indexes and Lightweight Encoders

Dual-encoder-based dense retrieval models have become the standard in IR. They employ large Transformer-based language models, which are notoriously inefficient in terms of resources and latency. We propose Fast-Forward indexes—vector forward indexes which exploit the semantic matching capabilities of dual-encoder models for efficient and effective re-ranking. Our framework enables re-ranking at very high retrieval depths and combines the merits of both lexical and semantic matching via score interpolation. Furthermore, in order to mitigate the limitations of dual-encoders, we tackle two main challenges: Firstly, we improve computational efficiency by either pre-computing representations, avoiding unnecessary computations altogether, or reducing the complexity of encoders. This allows us to considerably improve ranking efficiency and latency. Secondly, we optimize the memory footprint and maintenance cost of indexes; we propose two complementary techniques to reduce the index size and show that, by dynamically dropping irrelevant document tokens, the index maintenance efficiency can be improved substantially. We perform evaluation to show the effectiveness and efficiency of Fast-Forward indexes—our method has low latency and achieves competitive results without the need for hardware acceleration, such as GPUs.
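The re-ranking idea in the abstract (look up pre-computed document vectors in a forward index, then interpolate lexical and semantic scores) can be illustrated with a minimal sketch. The linear interpolation with a weight `alpha`, the dot-product similarity, and all names and toy values below are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product, used here as the semantic (dense) similarity."""
    return sum(x * y for x, y in zip(a, b))


def rerank(
    query_vec: Vector,
    lexical_hits: List[Tuple[str, float]],
    forward_index: Dict[str, Vector],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Re-rank first-stage hits by interpolating lexical and semantic scores.

    alpha is a hypothetical interpolation weight: 1.0 keeps the lexical
    ranking, 0.0 uses only the semantic score.
    """
    rescored = []
    for doc_id, lexical_score in lexical_hits:
        # Forward-index lookup: the document vector was computed offline,
        # so no document is encoded at query time.
        semantic_score = dot(query_vec, forward_index[doc_id])
        final = alpha * lexical_score + (1.0 - alpha) * semantic_score
        rescored.append((doc_id, final))
    # Highest interpolated score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy data, made up for illustration only.
forward_index = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.5, 0.5]}
lexical_hits = [("d1", 2.0), ("d2", 1.0), ("d3", 1.5)]  # e.g. first-stage lexical scores
query_vec = [0.0, 4.0]  # would come from a query encoder in practice

ranking = rerank(query_vec, lexical_hits, forward_index, alpha=0.5)
print(ranking)
```

In this toy run the document ranked last by the lexical stage (`d2`) moves to the top because its stored vector matches the query vector best. Only the query has to be encoded at query time; every document score is a dictionary lookup plus a dot product, which is why such a scheme can run without a GPU.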


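Source journal

ACM Transactions on Information Systems (Engineering and Technology: Computer Science, Information Systems)

CiteScore: 9.40

Self-citation rate: 14.30%

Articles published: 165

Review time: >12 weeks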

Journal description: The ACM Transactions on Information Systems (TOIS) publishes papers on information retrieval (such as search engines, recommender systems) that contain: new principled information retrieval models or algorithms with sound empirical validation; observational, experimental and/or theoretical studies yielding new insights into information retrieval or information seeking; accounts of applications of existing information retrieval techniques that shed light on the strengths and weaknesses of the techniques; formalization of new information retrieval or information seeking tasks and of methods for evaluating the performance on those tasks; development of content (text, image, speech, video, etc.) analysis methods to support information retrieval and information seeking; development of computational models of user information preferences and interaction behaviors; creation and analysis of evaluation methodologies for information retrieval and information seeking; or surveys of existing work that propose a significant synthesis. The information retrieval scope of ACM Transactions on Information Systems (TOIS) appeals to industry practitioners for its wealth of creative ideas, and to academic researchers for its descriptions of their colleagues' work.