Measuring Quality of Workers by Goodness-of-Fit of Machine Learning Model in Crowdsourcing

Yumiko Suzuki
Proceedings of the 25th International Database Engineering & Applications Symposium, published 2021-07-14
DOI: 10.1145/3472163.3472279 (https://doi.org/10.1145/3472163.3472279)

Abstract

In this paper, we propose a method for predicting the quality of crowdsourcing workers using the goodness-of-fit (GoF) of machine-learning models. We assume a relationship between the quality of a worker and the quality of a machine-learning model trained on that worker's outcomes. Under this assumption, if worker quality is high, a classifier constructed from the worker's outcomes can easily predict the outcomes of that worker. If the assumption holds, we can measure worker quality without correct answer sets, which reduces the requesters' time and effort. Conversely, if a worker's outcomes are of low quality, the input tweets do not correspond to the outcomes. Therefore, if we construct a tweet classifier from the input tweets and the worker's classification results, the classifier's predictions and the worker's outcomes should differ. We assume that GoF scores, such as the accuracy and F1 score of this classifier on a test set, correlate with worker quality, so that worker quality can be predicted from the GoF scores. In our experiment, we ran a tweet classification task using crowdsourcing and confirmed that the GoF scores and the quality of workers correlate. These results show that we can predict the quality of workers using the GoF scores.
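
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name a classifier or an evaluation protocol, so the small Naive Bayes model, the 5-fold split, the synthetic "tweets", and every function name below are assumptions chosen only to keep the example self-contained. The sketch trains a classifier on one worker's own labels and reports held-out accuracy as that worker's GoF score; no correct answer set is consulted when scoring.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(texts, labels):
    """Fit a tiny multinomial Naive Bayes model on whitespace tokens (assumed classifier)."""
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest Laplace-smoothed log-probability."""
    class_counts, token_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # ignore tokens never seen in training
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def gof_accuracy(texts, worker_labels, k=5):
    """GoF score: k-fold held-out accuracy of a model trained on the worker's own labels."""
    n = len(texts)
    hits = 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        model = train_nb([texts[i] for i in train],
                         [worker_labels[i] for i in train])
        hits += sum(predict_nb(model, texts[i]) == worker_labels[i] for i in test)
    return hits / n


def make_tweets(n, rng):
    """Generate hypothetical tweets whose words depend on a hidden class."""
    topics = {"negative": ["bad", "hate", "sad", "awful"],
              "positive": ["great", "love", "happy", "nice"]}
    filler = ["today", "the", "game", "phone", "movie", "weather"]
    texts, truth = [], []
    for _ in range(n):
        label = rng.choice(sorted(topics))
        words = rng.choices(topics[label], k=3) + rng.choices(filler, k=2)
        rng.shuffle(words)
        texts.append(" ".join(words))
        truth.append(label)
    return texts, truth


rng = random.Random(0)
texts, truth = make_tweets(200, rng)

# A careful worker labels consistently with the tweet content;
# a careless worker (simulated here) picks labels at random.
careful_labels = truth
careless_labels = [rng.choice(["negative", "positive"]) for _ in texts]

careful_gof = gof_accuracy(texts, careful_labels)
careless_gof = gof_accuracy(texts, careless_labels)
print(f"careful worker GoF:  {careful_gof:.2f}")
print(f"careless worker GoF: {careless_gof:.2f}")
```

In this toy setting the careful worker's labels are learnable from the tweet text, so the held-out accuracy is high, while the careless worker's random labels leave the classifier near chance. Whether the same gap appears on real crowdsourced data is the empirical question the paper's experiment addresses; an F1 score could be substituted for accuracy in `gof_accuracy` without changing the structure.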