Fernando Ferraretto, Thiago Laitz, R. Lotufo, Rodrigo Nogueira
{"title":"ExaRanker: Synthetic Explanations Improve Neural Rankers","authors":"Fernando Ferraretto, Thiago Laitz, R. Lotufo, Rodrigo Nogueira","doi":"10.1145/3539618.3592067","DOIUrl":null,"url":null,"abstract":"Recent work has shown that incorporating explanations into the output generated by large language models (LLMs) can significantly enhance performance on a broad spectrum of reasoning tasks. Our study extends these findings by demonstrating the benefits of explanations for neural rankers. By utilizing LLMs such as GPT-3.5 to enrich retrieval datasets with explanations, we trained a sequence-to-sequence ranking model, dubbed ExaRanker, to generate relevance labels and explanations for query-document pairs. The ExaRanker model, finetuned on a limited number of examples and synthetic explanations, exhibits performance comparable to models finetuned on three times more examples, but without explanations. Moreover, incorporating explanations imposes no additional computational overhead into the reranking step and allows for on-demand explanation generation. The codebase and datasets used in this study will be available at https://github.com/unicamp-dl/ExaRanker","PeriodicalId":425056,"journal":{"name":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3539618.3592067","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Recent work has shown that incorporating explanations into the output generated by large language models (LLMs) can significantly enhance performance on a broad spectrum of reasoning tasks. Our study extends these findings by demonstrating the benefits of explanations for neural rankers. By utilizing LLMs such as GPT-3.5 to enrich retrieval datasets with explanations, we trained a sequence-to-sequence ranking model, dubbed ExaRanker, to generate relevance labels and explanations for query-document pairs. The ExaRanker model, finetuned on a limited number of examples and synthetic explanations, exhibits performance comparable to models finetuned on three times more examples, but without explanations. Moreover, incorporating explanations imposes no additional computational overhead into the reranking step and allows for on-demand explanation generation. The codebase and datasets used in this study will be available at https://github.com/unicamp-dl/ExaRanker