A Crowdsourcing Repeated Annotations System for Visual Object Detection
Authors: Yucheng Hu, Zhonghong Ou, Xiangyu Xu, Meina Song
Published in: Proceedings of the 3rd International Conference on Vision, Image and Signal Processing, 2019-08-26
DOI: 10.1145/3387168.3387242 (https://doi.org/10.1145/3387168.3387242)
Citations: 4
Abstract
Object detection, a fundamental task in computer vision, has developed rapidly, driven by deep learning. The lack of large numbers of images with ground-truth annotations has become a chief obstacle to applying object detection in many fields. Eliciting labels from crowds is a potential way to obtain large amounts of labeled data. Nonetheless, existing crowdsourcing platforms, e.g., Amazon Mechanical Turk (MTurk), often fail to guarantee annotation quality, which harms the accuracy of deep detectors. A variety of methods have been developed for ground-truth inference and learning from crowds. In this paper, we study strategies for crowdsourcing repeated labels in support of these methods. The core challenge in building such a system is to reduce the difficulty of annotating multiple objects of interest while improving data quality as much as possible. We present a system that adopts a turn-based annotation mechanism and consists of three simple sub-tasks: a single-object annotation task, a quality verification task, and a coverage verification task. Experimental results demonstrate that our system is scalable and accurate, and that it can help the detector achieve higher accuracy.
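The abstract names the three sub-tasks but does not specify how they are sequenced, so the following is only a minimal sketch of one plausible turn-based loop, not the authors' implementation. The ordering of the steps, the `ImageTask` type, the `annotate_image` function, the `max_turns` limit, and the three worker callbacks are all hypothetical names introduced for illustration; in a real deployment each callback would be a task answered by a crowd worker.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Axis-aligned bounding box: (x_min, y_min, x_max, y_max). Assumed format.
Box = Tuple[float, float, float, float]


@dataclass
class ImageTask:
    """Hypothetical per-image state: the boxes accepted so far."""
    image_id: str
    accepted: List[Box] = field(default_factory=list)


def annotate_image(
    task: ImageTask,
    draw_one_box: Callable[[str, List[Box]], Optional[Box]],
    verify_quality: Callable[[str, Box], bool],
    verify_coverage: Callable[[str, List[Box]], bool],
    max_turns: int = 20,
) -> List[Box]:
    """Run turns until coverage is confirmed or max_turns is reached."""
    for _ in range(max_turns):
        # Coverage verification: is every object of interest already boxed?
        if verify_coverage(task.image_id, task.accepted):
            break
        # Single-object annotation: a worker draws exactly one new box.
        box = draw_one_box(task.image_id, task.accepted)
        if box is None:
            continue  # the worker found nothing to add this turn
        # Quality verification: keep the box only if a second worker approves.
        if verify_quality(task.image_id, box):
            task.accepted.append(box)
    return task.accepted


if __name__ == "__main__":
    # Stub "workers" that simulate an image containing two objects.
    objects = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]

    def draw(image_id: str, accepted: List[Box]) -> Optional[Box]:
        remaining = [b for b in objects if b not in accepted]
        return remaining[0] if remaining else None

    def quality(image_id: str, box: Box) -> bool:
        return box in objects

    def coverage(image_id: str, accepted: List[Box]) -> bool:
        return len(accepted) == len(objects)

    print(annotate_image(ImageTask("img-1"), draw, quality, coverage))
```

Asking each worker for a single box per turn keeps every task small, which is one way to read the stated goal of reducing the difficulty of annotating multiple objects. Running the whole loop several times per image with different workers would yield the repeated labels that ground-truth inference methods consume.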