{"title":"Focus Annotation of Task-based Data: Establishing the Quality of Crowd Annotation","authors":"K. Kuthy, Ramon Ziai, Walt Detmar Meurers","doi":"10.18653/v1/W16-1713","DOIUrl":null,"url":null,"abstract":"We explore the annotation of information structure in German and compare the quality of expert annotation with crowdsourced annotation taking into account the cost of reaching crowd consensus. Concretely, we discuss a crowd-sourcing effort annotating focus in a task-based corpus of German containing reading comprehension questions and answers. Against the backdrop of a gold standard reference resulting from adjudicated expert annotation, we evaluate a crowd sourcing experiment using majority voting to determine a baseline performance. To refine the crowd-sourcing setup, we introduce the Consensus Cost as a measure of agreement within the crowd. We investigate the usefulness of Consensus Cost as a measure of crowd annotation quality both intrinsically, in relation to the expert gold standard, and extrinsically, by integrating focus annotation information into a system performing Short Answer Assessment taking into account the Consensus Cost. We find that low Consensus Cost in crowd sourcing indicates high quality, though high cost does not necessarily indicate low accuracy but increased variability. Overall, taking Consensus Cost into account improves both intrinsic and extrinsic evaluation measures.","PeriodicalId":150065,"journal":{"name":"Proceedings of the 10th Linguistic Annotation Workshop held in\n conjunction with ACL 2016 (LAW-X 2016)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 10th Linguistic Annotation Workshop held in\n conjunction with ACL 2016 (LAW-X 2016)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/W16-1713","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 7
Abstract
We explore the annotation of information structure in German and compare the quality of expert annotation with that of crowdsourced annotation, taking into account the cost of reaching crowd consensus. Concretely, we discuss a crowdsourcing effort annotating focus in a task-based corpus of German containing reading comprehension questions and answers. Against the backdrop of a gold-standard reference resulting from adjudicated expert annotation, we evaluate a crowdsourcing experiment using majority voting to determine a baseline performance. To refine the crowdsourcing setup, we introduce Consensus Cost as a measure of agreement within the crowd. We investigate the usefulness of Consensus Cost as a measure of crowd annotation quality both intrinsically, in relation to the expert gold standard, and extrinsically, by integrating the focus annotation into a system performing Short Answer Assessment that takes Consensus Cost into account. We find that low Consensus Cost in crowdsourcing indicates high quality, while high cost does not necessarily indicate low accuracy but rather increased variability. Overall, taking Consensus Cost into account improves both intrinsic and extrinsic evaluation measures.
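
The abstract names two mechanisms without defining them formally: majority voting over crowd labels as the baseline, and Consensus Cost as a measure of within-crowd agreement. The Python sketch below illustrates one plausible reading, assuming Consensus Cost is simply the fraction of crowd votes that disagree with the majority label; the function names, the focus/background label set, and the exact formula are illustrative assumptions, not the authors' definitions.

from collections import Counter

def majority_label(votes):
    # Label chosen by the largest number of crowd workers for one item.
    return Counter(votes).most_common(1)[0][0]

def consensus_cost(votes):
    # Illustrative Consensus Cost: share of votes that disagree with the
    # majority label (0.0 = unanimous crowd, higher = more disagreement).
    # The paper's actual definition is not given in the abstract.
    majority_count = Counter(votes).most_common(1)[0][1]
    return 1.0 - majority_count / len(votes)

# Hypothetical example: five workers mark whether a token is in focus.
votes = ["focus", "focus", "background", "focus", "focus"]
print(majority_label(votes))            # focus
print(round(consensus_cost(votes), 2))  # 0.2 -> low cost, likely reliable

Under this reading, an item with cost 0.0 is unanimous and can be accepted as-is, while a high-cost item signals the increased variability the abstract mentions rather than outright error, which is why the cost can be used to weight or filter crowd labels before feeding them into the Short Answer Assessment system.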