The Influence of Topic Difficulty, Relevance Level, and Document Ordering on Relevance Judging
T. T. Damessie, Falk Scholer, J. Culpepper
Proceedings of the 21st Australasian Document Computing Symposium, published 2016-12-05
DOI: 10.1145/3015022.3015033 (https://doi.org/10.1145/3015022.3015033)
Citations: 11
Abstract
Judging the relevance of documents for an information need is an activity that underpins the most widely used approach in the evaluation of information retrieval systems. In this study we investigate the relationship between how long it takes an assessor to judge document relevance and three key factors that may influence the judging scenario: the difficulty of the search topic for which relevance is being assessed; the degree to which the documents are relevant to the search topic; and the order in which the documents are presented for judging. Two potential confounding influences on judgment speed are differences in individual reading ability and the length of the documents being assessed. We therefore propose two measures to investigate the above factors: normalized processing speed (NPS), which adjusts the number of words processed per minute by taking into account differences in reading speed between judges, and normalized dwell time (NDT), which adjusts the duration that a judge spent reading a document relative to document length. Note that these two measures have different relationships with overall judgment speed: a direct relationship for NPS, and an inverse relationship for NDT. The results of a small-scale user study show a statistically significant relationship between judgment speed and topic difficulty: for easier topics, assessors process documents more quickly (higher NPS) and spend less time overall (lower NDT). There is also a statistically significant relationship between the level of relevance of the document being assessed and overall judgment speed, with assessors taking less time for non-relevant documents. Finally, our results suggest that the presentation order of documents can also affect overall judgment speed, with assessors spending less time (smaller NDT) when documents are presented in relevance order than in docID order. However, these ordering effects are not significant when also accounting for document length variance (NPS).
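The abstract describes NPS and NDT only in words, so the following is a minimal sketch of how two such measures might be computed, not the authors' actual definitions. It assumes that NDT is dwell time divided by document length in words, and that NPS is the observed words-per-minute rate divided by a per-judge baseline reading speed. The function names and the `baseline_wpm` parameter are hypothetical and introduced here for illustration only.

```python
def normalized_dwell_time(dwell_seconds: float, doc_length_words: int) -> float:
    """Seconds spent per word of the document.

    Assumption: "relative to document length" is modeled as a simple ratio.
    """
    if doc_length_words <= 0:
        raise ValueError("doc_length_words must be positive")
    return dwell_seconds / doc_length_words


def normalized_processing_speed(
    dwell_seconds: float, doc_length_words: int, baseline_wpm: float
) -> float:
    """Words processed per minute, scaled by the judge's own reading speed.

    Assumption: baseline_wpm is a hypothetical per-judge reading speed
    measured separately; a value of 1.0 means the judge processed the
    document at exactly that baseline rate.
    """
    if dwell_seconds <= 0 or baseline_wpm <= 0:
        raise ValueError("dwell_seconds and baseline_wpm must be positive")
    words_per_minute = doc_length_words / (dwell_seconds / 60.0)
    return words_per_minute / baseline_wpm


if __name__ == "__main__":
    # A 600-word document judged in 120 s by a judge who reads 200 wpm.
    print(normalized_dwell_time(120, 600))             # 0.2
    print(normalized_processing_speed(120, 600, 200))  # 1.5
```

Under these assumptions, halving the time spent on the same document halves NDT and doubles NPS, which matches the inverse and direct relationships with judgment speed noted in the abstract.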