Multiple Linear Combination Approaches for Information Search in Ranking
Authors: Yizheng Huang, L. Zeng
DOI: 10.1109/WI-IAT55865.2022.00119
Published in: 2022 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)
Publication date: 2022-11-01
Citations: 0
Abstract
Since the well-known BM25 [1] was proposed, BM25 and its enhanced versions [2]–[4] have long dominated document/passage ranking tasks. With the advent of deep learning models such as BERT [5], pre-trained models have made noticeable progress on various information retrieval (IR) tasks. However, because BM25 is a "bag-of-words" retrieval method based on keyword matching, it remains the better option for passage ranking in some exceptional cases, such as identifying names [6]. Fusing BM25 with deep learning models is therefore a natural way to improve ranking results. This paper discusses various linear methods of combining BM25 with BERT and examines how they affect the models' final results. We conduct experiments on the MS MARCO V2 dataset, which show convincing results.
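To make the idea of a linear combination concrete, the following is a minimal sketch of fusing BM25 and BERT relevance scores via weighted interpolation. The function names, the min-max normalization step, and the single weight `alpha` are illustrative assumptions, not the paper's exact formulation; the paper compares several linear variants.

```python
# Hypothetical sketch: linear fusion of BM25 and BERT scores for a list
# of candidate passages retrieved for one query. Normalization and the
# interpolation weight alpha are assumptions for illustration.

def min_max_normalize(scores):
    """Scale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def linear_fuse(bm25_scores, bert_scores, alpha=0.5):
    """Combine normalized scores as alpha * bm25 + (1 - alpha) * bert."""
    b = min_max_normalize(bm25_scores)
    t = min_max_normalize(bert_scores)
    return [alpha * x + (1 - alpha) * y for x, y in zip(b, t)]

# Example: rank three candidate passages by the fused score, highest first.
bm25 = [12.3, 8.1, 10.6]   # raw BM25 scores (unbounded)
bert = [0.92, 0.35, 0.88]  # BERT relevance scores (e.g. softmax outputs)
fused = linear_fuse(bm25, bert, alpha=0.4)
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
# ranking == [0, 2, 1]
```

Normalizing both score lists before mixing matters because raw BM25 scores are unbounded while BERT scores typically lie in [0, 1]; without a common scale, one component would dominate regardless of `alpha`.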