OpenNIR: A Complete Neural Ad-Hoc Ranking Pipeline

Proceedings of the 13th International Conference on Web Search and Data Mining. Pub Date: 2020-01-20. DOI: 10.1145/3336191.3371864

Sean MacAvaney


Citations: 38

Abstract



OpenNIR: A Complete Neural Ad-Hoc Ranking Pipeline

With the growing popularity of neural approaches for ad-hoc ranking, there is a need for tools that can effectively reproduce prior results and ease continued research by supporting current state-of-the-art approaches. Although several excellent neural ranking tools exist, none offer an easy end-to-end ad-hoc neural ranking pipeline. A complete pipeline is particularly important for ad-hoc ranking because there are numerous parameter settings that have a considerable effect on the ultimate performance yet often are under-reported in current work (e.g., initial ranking settings, re-ranking threshold, training sampling strategy, etc.). In this work, I present a complete ad-hoc neural ranking pipeline which addresses these shortcomings: OpenNIR. The pipeline is easy to use (a single command will download required data, train, and evaluate a model), yet highly configurable, allowing for continued work in areas that are understudied. Aside from the core pipeline, the software also includes several bells and whistles that make use of components of the pipeline, such as performance benchmarking and tuning of unsupervised ranker parameters for fair comparisons against traditional baselines. The pipeline and these capabilities are demonstrated. The code is available, and contributions are welcome.

