How clumpy is my image? Evaluating crowdsourced annotation tasks
Hugo Hutt, R. Everson, Murray Grant, John Love, George R. Littlejohn
2013 13th UK Workshop on Computational Intelligence (UKCI), 31 October 2013. DOI: 10.1109/UKCI.2013.6651298
The use of citizen science to obtain annotations from multiple annotators has been shown to be an effective method for annotating datasets for which computational methods alone are not feasible. The way in which the annotations are obtained is an important consideration that affects the quality of the resulting consensus estimates. In this paper, we examine three separate approaches to obtaining scores for instances rather than merely classifications. To obtain a consensus score, annotators were asked to make annotations in one of three paradigms: classification, scoring and ranking. A web-based citizen science experiment is described which implements the three approaches as crowdsourced annotation tasks. The tasks are evaluated with respect to accuracy and agreement among the participants, using both simulated and real-world data from the experiment. The results show a clear difference in performance between the three tasks, with the ranking task achieving the highest accuracy and agreement among the participants. We show how a simple evolutionary optimiser may be used to improve performance by reweighting the importance of individual annotators.
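The abstract does not give implementation details, but the two ideas it describes can be illustrated with a minimal sketch: forming a consensus score from ranking annotations, and reweighting annotator importance with a simple evolutionary optimiser. The sketch below is an assumption-laden illustration, not the authors' code: the weighted mean-rank aggregation, the (1+1)-style mutate-and-select loop, the Spearman-correlation fitness against a simulated ground truth, and all function names (weighted_consensus_scores, evolve_weights) are hypothetical choices made here for clarity.

```python
# Illustrative sketch only (not the paper's implementation): weighted rank
# aggregation plus a simple (1+1)-style evolutionary reweighting of annotators.
import numpy as np


def weighted_consensus_scores(rankings, weights):
    """Combine per-annotator rankings into consensus scores.

    rankings: (n_annotators, n_instances) array, rankings[a, i] is the rank
              annotator a gave instance i (0 = least clumpy).
    weights:  (n_annotators,) non-negative importance weights.
    Returns a (n_instances,) weighted mean rank per instance.
    """
    w = np.clip(weights, 0.0, None)
    w = w / (w.sum() + 1e-12)      # normalise so the weights sum to one
    return rankings.T @ w          # weighted average rank per instance


def spearman(a, b):
    """Spearman rank correlation between two score vectors (no tie handling)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))


def evolve_weights(rankings, ground_truth, generations=500, sigma=0.1, rng=None):
    """(1+1)-style evolutionary search for annotator weights.

    Mutates the weight vector with Gaussian noise and keeps the mutant only if
    it does not reduce agreement (Spearman correlation) between the weighted
    consensus and a known ground truth, e.g. from simulated data.
    """
    rng = np.random.default_rng(rng)
    n_annotators = rankings.shape[0]
    weights = np.ones(n_annotators)   # start from equal importance
    best = spearman(weighted_consensus_scores(rankings, weights), ground_truth)

    for _ in range(generations):
        candidate = np.clip(weights + rng.normal(0.0, sigma, n_annotators), 0.0, None)
        fitness = spearman(weighted_consensus_scores(rankings, candidate), ground_truth)
        if fitness >= best:           # accept non-worse mutants
            weights, best = candidate, fitness
    return weights, best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_annotators, n_instances = 8, 30
    true_scores = rng.random(n_instances)          # simulated 'clumpiness'

    # Simulated annotators: most rank reasonably well, the last two are near-random.
    noise = np.full(n_annotators, 0.05)
    noise[-2:] = 1.0
    observed = true_scores + rng.normal(0.0, noise[:, None], (n_annotators, n_instances))
    rankings = np.argsort(np.argsort(observed, axis=1), axis=1).astype(float)

    before = spearman(weighted_consensus_scores(rankings, np.ones(n_annotators)), true_scores)
    weights, after = evolve_weights(rankings, true_scores, rng=1)
    print(f"agreement with ground truth: equal weights {before:.3f}, evolved weights {after:.3f}")
```

On simulated data of this kind, where the ground truth is known by construction, the optimiser would be expected to down-weight the near-random annotators and so raise the agreement between the consensus and the true scores; how the real experiment defined fitness and aggregated annotations is specified in the paper itself, not in this sketch.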