{"title":"Effect of Label Redundancy in Crowdsourcing for Training Machine Learning Models","authors":"Ayame Shimizu, Kei Wakabayashi","doi":"10.26421/jdi3.3-1","DOIUrl":null,"url":null,"abstract":"Crowdsourcing is widely utilized for collecting labeled examples to train supervised machine learning models, but the labels obtained from workers are considerably noisier than those from expert annotators. To address the noisy label issue, most researchers adopt the repeated labeling strategy, where multiple (redundant) labels are collected for each example and then aggregated. Although this improves the annotation quality, it decreases the amount of training data when the budget for crowdsourcing is limited, which is a negative factor in terms of the accuracy of the machine learning model to be trained. This paper empirically examines the extent to which repeated labeling contributes to the accuracy of machine learning models for image classification, named entity recognition and sentiment analysis under various conditions of budget and worker quality. We experimentally examined four hypotheses related to the effect of budget, worker quality, task difficulty, and redundancy on crowdsourcing. The results on image classification and named entity recognition supported all four hypotheses and suggested that repeated labeling almost always has a negative impact on machine learning when it comes to accuracy. Somewhat surprisingly, the results on sentiment analysis using pretrained models did not support the hypothesis which shows the possibility of remaining utilization of multiple-labeling.","PeriodicalId":232625,"journal":{"name":"J. Data Intell.","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"J. Data Intell.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.26421/jdi3.3-1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Crowdsourcing is widely used to collect labeled examples for training supervised machine learning models, but the labels obtained from workers are considerably noisier than those from expert annotators. To address the noisy-label issue, most researchers adopt a repeated labeling strategy, in which multiple (redundant) labels are collected for each example and then aggregated. Although this improves annotation quality, it reduces the amount of training data when the crowdsourcing budget is limited, which in turn hurts the accuracy of the machine learning model to be trained. This paper empirically examines the extent to which repeated labeling contributes to the accuracy of machine learning models for image classification, named entity recognition, and sentiment analysis under various conditions of budget and worker quality. We experimentally examined four hypotheses concerning the effects of budget, worker quality, task difficulty, and redundancy on crowdsourcing. The results on image classification and named entity recognition supported all four hypotheses and suggested that repeated labeling almost always has a negative impact on the accuracy of the trained model. Somewhat surprisingly, the results on sentiment analysis using pretrained models did not support the hypotheses, suggesting that repeated labeling may still be useful in some settings.
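The abstract does not state which aggregation rule the experiments use, but the standard baseline for repeated labeling is majority voting over the redundant labels. The sketch below is only an illustration of that baseline and of the budget trade-off it implies (the function name and toy data are not from the paper):

```python
from collections import Counter

def aggregate_majority_vote(labels_per_example):
    """Aggregate redundant crowd labels by majority vote.

    labels_per_example: dict mapping example id -> list of worker labels.
    Returns a dict mapping example id -> single aggregated label.
    Ties are broken arbitrarily among the tied labels.
    """
    return {
        example_id: Counter(labels).most_common(1)[0][0]
        for example_id, labels in labels_per_example.items()
    }

if __name__ == "__main__":
    # With a fixed budget of 6 labels, redundancy 3 yields only 2 aggregated
    # training examples, whereas redundancy 1 would yield 6 noisier ones --
    # the trade-off the paper studies.
    crowd_labels = {
        "img_001": ["cat", "cat", "dog"],
        "img_002": ["dog", "dog", "dog"],
    }
    print(aggregate_majority_vote(crowd_labels))
    # {'img_001': 'cat', 'img_002': 'dog'}
```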