On the Properties of Evaluation Metrics for Finding One Highly Relevant Document

T. Sakai

IPSJ Digital Courier, published 2007-09-15 (journal article)
DOI: 10.2197/IPSJDC.3.643 (https://doi.org/10.2197/IPSJDC.3.643)
Citations: 12

Abstract

Traditional information retrieval evaluation relies on both precision and recall. However, modern search environments such as the Web, in which recall is either unimportant or immeasurable, require precision-oriented evaluation. In particular, finding one highly relevant document is very important for practical tasks such as known-item search and suspected-item search. This paper compares the properties of five evaluation metrics that are applicable to the task of finding one highly relevant document, in terms of their underlying assumptions, how closely the system rankings they produce resemble each other, and their discriminative power. We employ two existing methods for comparing the discriminative power of these metrics: the Swap Method proposed by Voorhees and Buckley at ACM SIGIR 2002, and the Bootstrap Sensitivity Method proposed by Sakai at SIGIR 2006. We use four data sets from NTCIR to show that, while P(+)-measure, O-measure and NWRR (Normalised Weighted Reciprocal Rank) are reasonably highly correlated with one another, P(+)-measure and O-measure are more discriminative than NWRR, which in turn is more discriminative than Reciprocal Rank. We therefore conclude that P(+)-measure and O-measure, each modelling a different user behaviour, are the most useful evaluation metrics for the task of finding one highly relevant document.
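The abstract does not define the metrics it compares, so the following is only a minimal sketch of the simplest of them, Reciprocal Rank, in its standard textbook form. The function names, the integer relevance grades and the `min_level` threshold are illustrative assumptions, not taken from the paper. The sketch shows why a binary metric can be a blunt instrument for this task: two rankings that place a highly relevant document at different ranks can receive the same score.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[int], min_level: int = 1) -> float:
    """Return 1/r for the first rank r whose grade is >= min_level, else 0.0.

    `relevance` lists the (assumed) integer relevance grade of each retrieved
    document in ranked order; 0 means non-relevant.
    """
    for rank, grade in enumerate(relevance, start=1):
        if grade >= min_level:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[int]], min_level: int = 1) -> float:
    """Average reciprocal rank over the ranked lists of several topics."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, min_level) for r in runs) / len(runs)


# Hypothetical graded judgements: 0 = non-relevant, 1 = partially, 3 = highly relevant.
ranking_a = [0, 1, 3]  # highly relevant document at rank 3
ranking_b = [0, 3, 1]  # highly relevant document at rank 2

# With a binary threshold, both rankings look identical to Reciprocal Rank.
same_score = reciprocal_rank(ranking_a) == reciprocal_rank(ranking_b)
```

Raising `min_level` to 3 separates the two rankings, but at the cost of ignoring partially relevant documents altogether; graded metrics such as those named in the abstract are designed to avoid that trade-off.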