{"title":"Semantic Guided Latent Parts Embedding for Few-Shot Learning","authors":"Fengyuan Yang, Ruiping Wang, Xilin Chen","doi":"10.1109/WACV56688.2023.00541","DOIUrl":"https://doi.org/10.1109/WACV56688.2023.00541","url":null,"abstract":"The ability of few-shot learning (FSL) is a basic requirement of intelligent agent learning in the open visual world. However, existing deep learning systems rely too heavily on large numbers of training samples, making it hard to learn new categories efficiently from limited size of training data. Two key challenges of FSL are insufficient comprehension and imperfect modeling of the few-shot novel class. For insufficient visual comprehension, semantic knowledge which is information from other modalities can help replenish the understanding of novel classes. But even so, most works still suffer from the second challenge because the single global class prototype they adopted is extremely unstable and imperfect given the larger intra-class variation and harder inter-class discrimination in FSL scenario. Thus, we propose to represent each class by its several different parts with the help of class semantic knowledge. Since we can never pre-define parts for unknown novel classes, we embed them in a latent manner. Concretely, we train a generator that takes the class semantic knowledge as input and outputs several filters of class-specific semantic latent parts. By applying each part filter, our model can pay attention to corresponding local regions containing each part. At the inference stage, the classification is conducted by comparing the similarities between those parts. Experiments on several FSL benchmarks demonstrate the effectiveness of our proposed method and show its potential to go beyond class recognition to class understanding. Furthermore, we also find when semantic knowledge is more visualized and customized, it will be more helpful in the FSL task.","PeriodicalId":270631,"journal":{"name":"2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115154950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pruning-Guided Curriculum Learning for Semi-Supervised Semantic Segmentation","authors":"Heejo Kong, Gun-Hee Lee, Suneung Kim, Seonghyeon Lee","doi":"10.1109/WACV56688.2023.00586","DOIUrl":"https://doi.org/10.1109/WACV56688.2023.00586","url":null,"abstract":"This study focuses on improving the quality of pseudolabeling in the context of semi-supervised semantic segmentation. Previous studies have adopted confidence thresholding to reduce erroneous predictions in pseudo-labeled data and to enhance their qualities. However, numerous pseudolabels with high confidence scores exist in the early training stages even though their predictions are incorrect, and this ambiguity limits confidence thresholding substantially. In this paper, we present a novel method to resolve the ambiguity of confidence scores with the guidance of network pruning. A recent finding showed that network pruning severely impairs the network generalization ability on samples that are not yet well learned or represented. Inspired by this finding, we refine the confidence scores by reflecting the extent to which the predictions are affected by pruning. Furthermore, we adopted a curriculum learning strategy for the confidence score, which enables the network to learn gradually from easy to hard samples. This approach resolves the ambiguity by suppressing the learning of noisy pseudolabels, the confidence scores of which are difficult to trust owing to insufficient training in the early stages. Extensive experiments on various benchmarks demonstrate the superiority of our framework over state-of-the-art alternatives.","PeriodicalId":270631,"journal":{"name":"2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133206193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robustness of Trajectory Prediction Models Under Map-Based Attacks","authors":"Z. Zheng, Xiaowen Ying, Zhen Yao, M. Chuah","doi":"10.1109/WACV56688.2023.00452","DOIUrl":"https://doi.org/10.1109/WACV56688.2023.00452","url":null,"abstract":"Trajectory Prediction (TP) is a critical component in the control system of an Autonomous Vehicle (AV). It predicts future motion of traffic agents based on observations of their past trajectories. Existing works have studied the vulnerability of TP models when the perception systems are under attacks and proposed corresponding mitigation schemes. Recent TP designs have incorporated context map information for performance enhancements. Such designs are subjected to a new type of attacks where an attacker can interfere with these TP models by attacking the context maps. In this paper, we study the robustness of TP models under our newly proposed map-based adversarial attacks. We show that such attacks can compromise state-of-the-art TP models that use either image-based or node-based map representation while keeping the adversarial examples imperceptible. We also demonstrate that our attacks can still be launched under the black-box settings without any knowledge of the TP models running underneath. Our experiments on the NuScene dataset show that the proposed map-based attacks can increase the trajectory prediction errors by 29-110%. Finally, we demonstrate that two defense mechanisms are effective in defending against such map-based attacks.","PeriodicalId":270631,"journal":{"name":"2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130385344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image-Consistent Detection of Road Anomalies as Unpredictable Patches","authors":"Tomás Vojír, Jiri Matas","doi":"10.1109/WACV56688.2023.00545","DOIUrl":"https://doi.org/10.1109/WACV56688.2023.00545","url":null,"abstract":"We propose a novel method for anomaly detection primarily aiming at autonomous driving. The design of the method, called DaCUP (Detection of anomalies as Consistent Unpredictable Patches), is based on two general properties of anomalous objects: an anomaly is (i) not from a class that could be modelled and (ii) it is not similar (in appearance) to non-anomalous objects in the image. To this end, we propose a novel embedding bottleneck in an auto-encoder like architecture that enables modelling of a diverse, multi-modal known class appearance (e.g. road). Secondly, we introduce novel image-conditioned distance features that allow known class identification in a nearest-neighbour manner on-the-fly, greatly increasing its ability to distinguish true and false positives. Lastly, an inpainting module is utilized to model the uniqueness of detected anomalies and significantly reduce false positives by filtering regions that are similar, thus reconstructable from their neighbourhood. We demonstrate that filtering of regions based on their similarity to neighbour regions, using e.g. an inpainting module, is general and can be used with other methods for reduction of false positives. The proposed method is evaluated on several publicly available datasets for road anomaly detection and on a maritime benchmark for obstacle avoidance. The method achieves state-of-the-art performance in both tasks with the same hyper-parameters with no domain specific design.","PeriodicalId":270631,"journal":{"name":"2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115217894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MORGAN: Meta-Learning-based Few-Shot Open-Set Recognition via Generative Adversarial Network","authors":"Debabrata Pal, Shirsha Bose, Biplab Banerjee, Y. Jeppu","doi":"10.1109/WACV56688.2023.00623","DOIUrl":"https://doi.org/10.1109/WACV56688.2023.00623","url":null,"abstract":"In few-shot open-set recognition (FSOSR) for hyperspectral images (HSI), one major challenge arises due to the simultaneous presence of spectrally fine-grained known classes and outliers. Prior research on generative FSOSR cannot handle such a situation due to their inability to approximate the open space prudently. To address this issue, we propose a method, Meta-learning-based Open-set Recognition via Generative Adversarial Network (MORGAN), that can learn a finer separation between the closed and the open spaces. MORGAN seeks to generate class-conditioned adversarial samples for both the closed and open spaces in the few-shot regime using two GANs by judiciously tuning noise variance while ensuring discriminability using a novel Anti-Overlap Latent (AOL) regularizer. Adversarial samples from low noise variance amplify known class data density, and we use samples from high noise variance to augment \"known-unknowns\". A first-order episodic strategy is adapted to ensure stability in the GAN training. Finally, we introduce a combination of metric losses which push these augmented \"known-unknowns\" or outliers to disperse in the open space while condensing known class distributions. Extensive experiments on four benchmark HSI datasets indicate that MORGAN achieves state-of-the-art FSOSR performance consistently.1","PeriodicalId":270631,"journal":{"name":"2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115124703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Representation Disentanglement in Generative Models with Contrastive Learning","authors":"Shentong Mo, Zhun Sun, Chao Li","doi":"10.1109/WACV56688.2023.00158","DOIUrl":"https://doi.org/10.1109/WACV56688.2023.00158","url":null,"abstract":"Contrastive learning has shown its effectiveness in image classification and generation. Recent works apply contrastive learning to the discriminator of the Generative Adversarial Networks. However, there is little work exploring if contrastive learning can be applied to the encoderdecoder structure to learn disentangled representations. In this work, we propose a simple yet effective method via incorporating contrastive learning into latent optimization, where we name it ContraLORD. Specifically, we first use a generator to learn discriminative and disentangled embeddings via latent optimization. Then an encoder and two momentum encoders are applied to dynamically learn disentangled information across a large number of samples with content-level and residual-level contrastive loss. In the meanwhile, we tune the encoder with the learned embeddings in an amortized manner. We evaluate our approach on ten benchmarks regarding representation disentanglement and linear classification. Extensive experiments demonstrate the effectiveness of our ContraLORD on learning both discriminative and generative representations.","PeriodicalId":270631,"journal":{"name":"2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115422818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sim2real Transfer Learning for Point Cloud Segmentation: An Industrial Application Case on Autonomous Disassembly","authors":"Chengzhi Wu, Xuelei Bi, Julius Pfrommer, Alexander Cebulla, Simon Mangold, J. Beyerer","doi":"10.1109/WACV56688.2023.00451","DOIUrl":"https://doi.org/10.1109/WACV56688.2023.00451","url":null,"abstract":"On robotics computer vision tasks, generating and annotating large amounts of data from real-world for the use of deep learning-based approaches is often difficult or even impossible. A common strategy for solving this problem is to apply simulation-to-reality (sim2real) approaches with the help of simulated scenes. While the majority of current robotics vision sim2real work focuses on image data, we present an industrial application case that uses sim2real transfer learning for point cloud data. We provide insights on how to generate and process synthetic point cloud data in order to achieve better performance when the learned model is transferred to real-world data. The issue of imbalanced learning is investigated using multiple strategies. A novel patch-based attention network is proposed additionally to tackle this problem.","PeriodicalId":270631,"journal":{"name":"2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124778194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SHARDS: Efficient SHAdow Removal using Dual Stage Network for High-Resolution Images","authors":"Mrinmoy Sen, Sai Pradyumna Chermala, Nazrinbanu Nurmohammad Nagori, V. Peddigari, Praful Mathur, B. H. P. Prasad, Moonsik Jeong","doi":"10.1109/WACV56688.2023.00185","DOIUrl":"https://doi.org/10.1109/WACV56688.2023.00185","url":null,"abstract":"Shadow Removal is an important and widely researched topic in computer vision. Recent advances in deep learning have resulted in addressing this problem by using convolutional neural networks (CNNs) similar to other vision tasks. But these existing works are limited to low-resolution images. Furthermore, the existing methods rely on heavy network architectures which cannot be deployed on resource-constrained platforms like smartphones. In this paper, we propose SHARDS, a shadow removal method for high-resolution images. The proposed method solves shadow removal for high-resolution images in two stages using two lightweight networks: a Low-resolution Shadow Removal Network (LSRNet) followed by a Detail Refinement Network (DRNet). LSRNet operates at low-resolution and computes a low-resolution, shadow-free output. It achieves state-of-the-art results on standard datasets with 65x lesser network parameters than existing methods. This is followed by DRNet, which is tasked to refine the low-resolution output to a high-resolution output using the high-resolution input shadow image as guidance. We construct high-resolution shadow removal datasets and through our experiments, prove the effectiveness of our proposed method on them. It is then demonstrated that this method can be deployed on modern day smartphones and is the first of its kind solution that can efficiently (2.4secs) perform shadow removal for high-resolution images (12MP) in these devices. Like many existing approaches, our shadow removal network relies on a shadow region mask as input to the network. To complement the lightweight shadow removal network, we also propose a lightweight shadow detector in this paper.","PeriodicalId":270631,"journal":{"name":"2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125031012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Expert-defined Keywords Improve Interpretability of Retinal Image Captioning","authors":"Ting-Wei Wu, Jia-Hong Huang, Joseph Lin, M. Worring","doi":"10.1109/WACV56688.2023.00190","DOIUrl":"https://doi.org/10.1109/WACV56688.2023.00190","url":null,"abstract":"Automatic machine learning-based (ML-based) medical report generation systems for retinal images suffer from a relative lack of interpretability. Hence, such ML-based systems are still not widely accepted. The main reason is that trust is one of the important motivating aspects of interpretability and humans do not trust blindly. Precise technical definitions of interpretability still lack consensus. Hence, it is difficult to make a human-comprehensible ML-based medical report generation system. Heat maps/saliency maps, i.e., post-hoc explanation approaches, are widely used to improve the interpretability of ML-based medical systems. However, they are well known to be problematic. From an ML-based medical model’s perspective, the highlighted areas of an image are considered important for making a prediction. However, from a doctor’s perspective, even the hottest regions of a heat map contain both useful and non-useful information. Simply localizing the region, therefore, does not reveal exactly what it was in that area that the model considered useful. Hence, the post-hoc explanation-based method relies on humans who probably have a biased nature to decide what a given heat map might mean. Interpretability boosters, in particular expert-defined keywords, are effective carriers of expert domain knowledge and they are human-comprehensible. In this work, we propose to exploit such keywords and a specialized attention-based strategy to build a more human-comprehensible medical report generation system for retinal images. Both keywords and the proposed strategy effectively improve the interpretability. The proposed method achieves state-of-the-art performance under commonly used text evaluation metrics BLEU, ROUGE, CIDEr, and METEOR. Project website: https://github.com/Jhhuangkay/Expert-defined-Keywords-Improve-Interpretability-of-Retinal-Image-Captioning.","PeriodicalId":270631,"journal":{"name":"2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124015723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Lightweight Neural Networks via Channel-Split Recurrent Convolution","authors":"Guojun Wu, Xin Zhang, Ziming Zhang, Yanhua Li, Xun Zhou, Christopher G. Brinton, Zhenming Liu","doi":"10.1109/WACV56688.2023.00385","DOIUrl":"https://doi.org/10.1109/WACV56688.2023.00385","url":null,"abstract":"Lightweight neural networks refer to deep networks with small numbers of parameters, which can be deployed in resource-limited hardware such as embedded systems. To learn such lightweight networks effectively and efficiently, in this paper we propose a novel convolutional layer, namely Channel-Split Recurrent Convolution (CSR-Conv), where we split the output channels to generate data sequences with length T as the input to the recurrent layers with shared weights. As a consequence, we can construct lightweight convolutional networks by simply replacing (some) linear convolutional layers with CSR-Conv layers. We prove that under mild conditions the model size decreases with the rate of $Oleft( {frac{1}{{{T^2}}}} right)$. Empirically we demonstrate the state-of-the-art performance using VGG-16, ResNet-50, ResNet-56, ResNet-110, DenseNet-40, MobileNet, and EfficientNet as backbone networks on CIFAR-10 and ImageNet. Codes can be found on https://github.com/tuaxon/CSR_Conv.","PeriodicalId":270631,"journal":{"name":"2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125222861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}