The Effect of Document Order and Topic Difficulty on Assessor Agreement
T. T. Damessie, Falk Scholer, K. Järvelin, J. Culpepper
Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval, 2016-09-12
DOI: 10.1145/2970398.2970431
Citations: 9
Abstract
Human relevance judgments are a key component for measuring the effectiveness of information retrieval systems using test collections. Since relevance is not an absolute concept, human assessors can disagree on particular topic-document pairs for a variety of reasons. In this work we investigate the effect that document presentation order has on inter-rater agreement, comparing two presentation ordering approaches similar to those used in IR evaluation campaigns: decreasing relevance order and document identifier order. We make a further distinction between "easy" topics and "hard" topics in order to explore system effects on inter-rater agreement. The results of our pilot user study indicate that assessor agreement is higher when documents are judged in document identifier order. In addition, there is higher overall agreement on easy topics than on hard topics.
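The abstract does not state which agreement statistic the study uses, so as an illustration only, the sketch below computes Cohen's kappa, a standard inter-rater agreement measure for categorical judgments. All names and judgment data here are invented for the example; they are not taken from the paper.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two assessors' categorical judgments."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items where the assessors match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: expected matches if each assessor labeled
    # independently according to their own marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical binary relevance judgments (1 = relevant, 0 = not)
# from two assessors over the same ten topic-document pairs.
assessor_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
assessor_2 = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]
print(f"kappa = {cohens_kappa(assessor_1, assessor_2):.2f}")  # kappa = 0.58
```

Comparing such a statistic across judgment conditions (e.g., documents shown in decreasing relevance order versus document identifier order, or easy versus hard topics) is one plausible way the kind of agreement differences reported in the abstract could be quantified.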