Debiased Label Aggregation for Subjective Crowdsourcing Tasks

S. Wallace, Tianyuan Cai, Brendan Le, Luis A Leiva
{"title":"Debiased Label Aggregation for Subjective Crowdsourcing Tasks","authors":"S. Wallace, Tianyuan Cai, Brendan Le, Luis A Leiva","doi":"10.1145/3491101.3519614","DOIUrl":null,"url":null,"abstract":"Human Intelligence Tasks (HITs) allow people to collect and curate labeled data from multiple annotators. Then labels are often aggregated to create an annotated dataset suitable for supervised machine learning tasks. The most popular label aggregation method is majority voting, where each item in the dataset is assigned the most common label from the annotators. This approach is optimal when annotators are unbiased domain experts. In this paper, we propose Debiased Label Aggregation (DLA) an alternative method for label aggregation in subjective HITs, where cross-annotator agreement varies. DLA leverages user voting behavior patterns to weight labels. Our experiments show that DLA outperforms majority voting in several performance metrics; e.g. a percentage increase of 20 points in the F1 measure before data augmentation, and a percentage increase of 35 points in the same measure after data augmentation. Since DLA is deceptively simple, we hope it will help researchers to tackle subjective labeling tasks.","PeriodicalId":123301,"journal":{"name":"CHI Conference on Human Factors in Computing Systems Extended Abstracts","volume":"48 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"CHI Conference on Human Factors in Computing Systems Extended Abstracts","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3491101.3519614","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Human Intelligence Tasks (HITs) allow people to collect and curate labeled data from multiple annotators. These labels are often then aggregated to create an annotated dataset suitable for supervised machine learning tasks. The most popular label aggregation method is majority voting, where each item in the dataset is assigned the most common label from the annotators. This approach is optimal when annotators are unbiased domain experts. In this paper, we propose Debiased Label Aggregation (DLA), an alternative method for label aggregation in subjective HITs, where cross-annotator agreement varies. DLA leverages user voting behavior patterns to weight labels. Our experiments show that DLA outperforms majority voting on several performance metrics; e.g., an increase of 20 percentage points in the F1 measure before data augmentation, and an increase of 35 percentage points in the same measure after data augmentation. Since DLA is deceptively simple, we hope it will help researchers to tackle subjective labeling tasks.
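The abstract does not spell out DLA's exact weighting scheme, but the contrast it draws between plain majority voting and weighting labels by annotator voting behavior can be sketched as follows. This is an illustrative assumption, not the authors' implementation: the per-annotator weights here are hypothetical values standing in for whatever behavior-derived reliability scores DLA computes.

```python
from collections import Counter

def majority_vote(labels):
    """Plain majority voting: return the most common label among
    annotators for one item (ties broken arbitrarily)."""
    return Counter(labels).most_common(1)[0][0]

def weighted_vote(item_labels, annotator_weights):
    """Weighted aggregation (illustrative sketch, NOT the paper's DLA):
    each annotator's label is scored by a per-annotator weight, e.g.
    derived from how often that annotator agrees with the crowd on
    other items, and the highest-scoring label wins."""
    scores = {}
    for annotator, label in item_labels.items():
        scores[label] = scores.get(label, 0.0) + annotator_weights[annotator]
    return max(scores, key=scores.get)

# A single reliable annotator can outvote two low-agreement annotators
# under weighting, whereas majority voting would side with the pair.
item = {"a1": "cat", "a2": "dog", "a3": "dog"}
weights = {"a1": 0.9, "a2": 0.2, "a3": 0.2}  # hypothetical reliabilities
print(majority_vote(list(item.values())))    # dog
print(weighted_vote(item, weights))          # cat
```

The point of the sketch is only the structural difference: majority voting treats every annotator identically, while a behavior-weighted scheme lets agreement patterns shift the outcome on subjective items.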