Crowdsourcing with Self-paced Workers

2021 IEEE International Conference on Data Mining (ICDM). Pub Date: 2021-12-01. DOI: 10.1109/ICDM51629.2021.00038

Xiangping Kang, Guoxian Yu, C. Domeniconi, Jun Wang, Weicong Guo, Yazhou Ren, Lili Cui


Citations: 5

Abstract

Crowdsourcing is a popular and relatively economical way to harness human intelligence for tasks that are hard for computers. Because of diverse factors (e.g., task difficulty, worker capability, and incentives), the answers collected from different crowd workers vary in quality. Many approaches have been proposed to obtain high-quality answers and to reduce the budget by modelling tasks, workers, or both. However, most existing approaches implicitly assume that a worker's capability is fixed during the crowdsourcing process. In practice, this capability can improve as the worker completes tasks ordered from easy to hard, akin to the intrinsic self-paced learning ability of human beings. In this paper, we investigate crowdsourcing with self-paced workers, whose capability is gradually boosted as they scrutinise and complete easy-to-hard tasks. Our proposed SPCrowd (Self-Paced Crowd worker) first asks workers to complete a set of golden tasks with known annotations, then provides feedback to help workers capture the raw modes of tasks and to spark self-paced learning, which in turn facilitates the estimation of workers' quality and tasks' difficulty. It then introduces a task difficulty model to quantify the difficulty of tasks and rank them from easy to hard, and a benefit maximization criterion for task assignment, which dynamically monitors the quality of self-paced workers and assigns the sorted tasks to capable workers. In this way, a worker can successfully complete hard tasks after completing easier, related tasks. Experimental results on semi-simulated and real crowdsourcing projects show that SPCrowd controls quality better and saves more budget than competitive baselines.
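
The pipeline described in the abstract (golden tasks, easy-to-hard ranking, benefit-driven assignment, capability that grows over time) can be illustrated with a small toy sketch. This is not the SPCrowd model: the abstract gives no formulas, so the logistic success probability, the `benefit = p_correct - cost` rule, the `growth` update, and every name below are assumptions chosen only to make the idea concrete.

```python
import math
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    capability: float  # assumed to lie in [0, 1]
    cost: float        # assumed per-task cost on the same scale as benefit


def golden_accuracy(answers, truth):
    """Fraction of golden tasks (tasks with known annotations) answered correctly."""
    return sum(a == t for a, t in zip(answers, truth)) / len(truth)


def p_correct(capability, difficulty, sharpness=8.0):
    """Assumed logistic link: success is likely when capability exceeds difficulty."""
    return 1.0 / (1.0 + math.exp(-sharpness * (capability - difficulty)))


def assign_easy_to_hard(workers, difficulties, growth=0.1):
    """Walk tasks from easy to hard and give each to the worker with the highest
    assumed benefit (success probability minus cost). After each task the chosen
    worker's capability is nudged upward, more so for tasks within reach."""
    order = sorted(range(len(difficulties)), key=lambda t: difficulties[t])
    plan = []
    for t in order:
        best = max(
            workers,
            key=lambda w: p_correct(w.capability, difficulties[t]) - w.cost,
        )
        plan.append((t, best.name))
        # Hypothetical self-paced update: bounded growth toward 1.0.
        gain = growth * p_correct(best.capability, difficulties[t])
        best.capability += gain * (1.0 - best.capability)
    return plan
```

In this sketch, a worker's initial `capability` could be set from `golden_accuracy` on the golden tasks, and `difficulties` would come from whatever difficulty model is in use. Because capability is re-read at every step, the assignment reacts to workers improving as they move through the sorted tasks.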

