Yuang Deng, Yuhang Zhang, Wenrui Dai, Xiaopeng Zhang, H. Xiong
{"title":"有效目标检测的弱监督区域级对比学习","authors":"Yuang Deng, Yuhang Zhang, Wenrui Dai, Xiaopeng Zhang, H. Xiong","doi":"10.1109/VCIP56404.2022.10008827","DOIUrl":null,"url":null,"abstract":"Semi-supervised learning, which assigns pseudo labels with models trained using limited labeled data, has been widely used in object detection to reduce the labeling cost. However, the provided pseudo annotations inevitably suffer noise since the initial model is not perfect. To address this issue, this paper introduces contrastive learning into semi-supervised object detection, and we claim that contrastive loss, which inherently relies on data augmentations, is much more robust than traditional softmax regression for noisy labels. To take full advantage of it in the detection task, we incorporate labels prior to contrastive loss and leverage plenty of region proposals to enhance diversity, which is crucial for contrastive learning. In this way, the model is optimized to make the region-level features with the same class be translation and scale invariant. Furthermore, we redesign the negative memory bank in contrastive learning to make the training more efficient. As far as we know, we are the first attempt that introduces contrastive learning in semi-supervised object detection. Experimental results on detection benchmarks demonstrate the superiority of our method. 
Notably, our method achieves 79.9% accuracy on VOC, which is 6.2% better than the supervised baseline and 0.7% improvement compared with the state-of-the-art method.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Weakly Supervised Region-Level Contrastive Learning for Efficient Object Detection\",\"authors\":\"Yuang Deng, Yuhang Zhang, Wenrui Dai, Xiaopeng Zhang, H. Xiong\",\"doi\":\"10.1109/VCIP56404.2022.10008827\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Semi-supervised learning, which assigns pseudo labels with models trained using limited labeled data, has been widely used in object detection to reduce the labeling cost. However, the provided pseudo annotations inevitably suffer noise since the initial model is not perfect. To address this issue, this paper introduces contrastive learning into semi-supervised object detection, and we claim that contrastive loss, which inherently relies on data augmentations, is much more robust than traditional softmax regression for noisy labels. To take full advantage of it in the detection task, we incorporate labels prior to contrastive loss and leverage plenty of region proposals to enhance diversity, which is crucial for contrastive learning. In this way, the model is optimized to make the region-level features with the same class be translation and scale invariant. Furthermore, we redesign the negative memory bank in contrastive learning to make the training more efficient. As far as we know, we are the first attempt that introduces contrastive learning in semi-supervised object detection. Experimental results on detection benchmarks demonstrate the superiority of our method. 
Notably, our method achieves 79.9% accuracy on VOC, which is 6.2% better than the supervised baseline and 0.7% improvement compared with the state-of-the-art method.\",\"PeriodicalId\":269379,\"journal\":{\"name\":\"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)\",\"volume\":\"46 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/VCIP56404.2022.10008827\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VCIP56404.2022.10008827","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Weakly Supervised Region-Level Contrastive Learning for Efficient Object Detection
Semi-supervised learning, which assigns pseudo labels using models trained on limited labeled data, has been widely used in object detection to reduce labeling cost. However, these pseudo annotations inevitably suffer from noise, since the initial model is imperfect. To address this issue, this paper introduces contrastive learning into semi-supervised object detection; we argue that the contrastive loss, which inherently relies on data augmentation, is much more robust to noisy labels than traditional softmax regression. To take full advantage of it in the detection task, we incorporate label priors into the contrastive loss and leverage a large number of region proposals to enhance diversity, which is crucial for contrastive learning. In this way, the model is optimized so that region-level features of the same class are translation- and scale-invariant. Furthermore, we redesign the negative memory bank used in contrastive learning to make training more efficient. To the best of our knowledge, this is the first attempt to introduce contrastive learning into semi-supervised object detection. Experimental results on detection benchmarks demonstrate the superiority of our method. Notably, our method achieves 79.9% accuracy on VOC, 6.2% better than the supervised baseline and a 0.7% improvement over the state-of-the-art method.