Crowdsourcing ground truth for Question Answering using CrowdTruth

Benjamin Timmermans, Lora Aroyo, Chris Welty
Proceedings of the ... ACM Web Science Conference. Published 2015-06-28. DOI: 10.1145/2786451.2786492 (https://doi.org/10.1145/2786451.2786492)
Citations: 4

Abstract

Gathering training and evaluation data for open-domain tasks, such as general question answering, is challenging. Typically, ground truth data is provided by human expert annotators; in an open domain, however, experts are difficult to define. Moreover, the overall process of annotating examples can be lengthy and expensive. Crowdsourcing has therefore become a mainstream approach for filling this gap, i.e. for gathering human interpretation data. However, as in traditional expert annotation tasks, most crowdsourcing methods use majority voting to measure annotation quality and thus aim to identify a single right answer for each example, even though many annotation tasks admit multiple interpretations and therefore multiple correct answers to the same question. We present CrowdTruth, a crowdsourcing-based approach for efficiently gathering ground truth data, in which disagreement-based metrics are used to harness the multitude of human interpretations and to measure the quality of the resulting ground truth. We exemplify our approach in two semantic interpretation use cases for question answering.
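The contrast drawn in the abstract between majority voting and keeping multiple interpretations can be illustrated with a minimal sketch. This is not the CrowdTruth metric set, which the abstract does not specify; it is a plain vote-fraction illustration. The function names, the worker answers, and the 0.3 threshold are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(answers):
    """Return the single most frequent answer (ties go to the first one seen)."""
    return Counter(answers).most_common(1)[0][0]


def answer_support(answers):
    """Return, for each distinct answer, the fraction of workers who chose it."""
    counts = Counter(answers)
    total = len(answers)
    return {answer: count / total for answer, count in counts.items()}


# Hypothetical worker answers to one ambiguous question (made-up data).
worker_answers = ["A", "A", "B", "A", "B", "C"]

# Majority voting collapses the crowd to one answer.
single_answer = majority_vote(worker_answers)

# Keeping the full distribution preserves the disagreement signal.
support = answer_support(worker_answers)

# Assumed, arbitrary cut-off: keep every answer chosen by at least 30% of workers.
plausible = sorted(a for a, s in support.items() if s >= 0.3)

print(single_answer)
print(plausible)
```

In this made-up case majority voting keeps only one answer, while the distribution shows that a second answer also has substantial support and could be a valid interpretation of the same question.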