Evaluating Temporal Persistence Using Replicability Measures

Conference and Labs of the Evaluation Forum. Pub Date: 2023-08-21. DOI: 10.48550/arXiv.2308.10549

Jüri Keller, Timo Breuer, Philipp Schaer

Citations: 0

Abstract

In real-world Information Retrieval (IR) experiments, the Evaluation Environment (EE) is exposed to constant change. Documents are added, removed, or updated, and users' information needs and search behavior evolve. At the same time, IR systems are expected to retain consistent quality. The LongEval Lab seeks to investigate the longitudinal persistence of IR systems, and in this work, we describe our participation. We submitted runs of five advanced retrieval systems, namely a Reciprocal Rank Fusion (RRF) approach, ColBERT, monoT5, Doc2Query, and E5, to both sub-tasks. Further, we cast the longitudinal evaluation as a replicability study to better understand the temporal change observed. As a result, we quantify the persistence of the submitted runs and see great potential in this evaluation method.
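The abstract names Reciprocal Rank Fusion (RRF) as one of the submitted approaches but gives no details of the configuration. As a rough illustration only, the following sketch shows the commonly used RRF formulation, in which each document's fused score is the sum of 1 / (k + rank) over the input rankings. The function name, the toy document IDs, and the constant k = 60 are assumptions chosen for illustration, not taken from the paper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    rankings: list of lists, each ordered from best to worst.
    k: smoothing constant (60 is a common default; an assumption here,
       the paper's abstract does not state the value used).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top document contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by document ID
    # so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical toy rankings from three retrieval systems.
runs = [
    ["d1", "d2", "d3"],
    ["d2", "d3", "d1"],
    ["d2", "d1", "d4"],
]
fused = reciprocal_rank_fusion(runs)
print(fused)
```

Documents that appear near the top of several input rankings accumulate the highest fused scores, which is why RRF is often used to combine heterogeneous retrievers without score normalization.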
