2021 IEEE/CVF International Conference on Computer Vision (ICCV): Latest Publications
Weakly Supervised Segmentation of Small Buildings with Point Labels
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00731
Jae-Hun Lee, ChanYoung Kim, S. Sull
Abstract: Most supervised image segmentation methods require delicate and time-consuming pixel-level labeling of buildings or objects, especially for small objects. In this paper, we present a weakly supervised segmentation network for aerial/satellite images that considers small and large objects separately. First, we propose a simple point labeling method for small objects, while large objects are fully labeled. Then, we present a segmentation network trained with a small-object mask that separates small and large objects in the loss function. During training, we employ a memory bank to cope with the limited number of point labels. Experimental results on three public datasets demonstrate the feasibility of our approach.
Pages: 7386-7395
Citations: 8
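The mixed supervision described in the abstract, dense masks for large objects and sparse point labels for small ones, can be illustrated with a partial cross-entropy that only scores labeled pixels. This is a minimal sketch, not the paper's actual loss; all function names are illustrative.

```python
import numpy as np

def partial_bce(probs, labels, supervised):
    """Binary cross-entropy averaged only over supervised pixels.

    probs      : (H, W) predicted foreground probabilities
    labels     : (H, W) 0/1 ground truth (arbitrary where unsupervised)
    supervised : (H, W) boolean mask, True where a label exists
    """
    eps = 1e-7
    p = np.clip(probs, eps, 1 - eps)
    ce = -(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    return float(ce[supervised].mean())

def weak_seg_loss(probs, labels, large_mask, point_mask):
    """Large objects: every pixel supervised; small objects: points only."""
    # Large-object regions (with dense masks) and annotated point
    # locations are the only pixels that contribute to the loss.
    return partial_bce(probs, labels, large_mask | point_mask)
```

Unlabeled pixels contribute nothing, which is what makes a handful of point clicks per small building a usable training signal.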
Just a Few Points are All You Need for Multi-view Stereo: A Novel Semi-supervised Learning Method for Multi-view Stereo
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00612
Taekyung Kim, Jaehoon Choi, Seokeon Choi, Dongki Jung, Changick Kim
Abstract: While learning-based multi-view stereo (MVS) methods have recently shown strong performance in quality and efficiency, limited MVS data hampers generalization to unseen environments. A simple solution is to generate various large-scale MVS datasets, but generating dense ground truth for 3D structure requires a huge amount of time and resources. On the other hand, if the reliance on dense ground truth is relaxed, MVS systems will generalize more smoothly to new environments. To this end, we introduce a novel semi-supervised multi-view stereo framework, the Sparse Ground Truth-based MVS Network (SGT-MVSNet), that can reliably reconstruct 3D structures even with only a few ground truth 3D points. Our strategy is to divide the accurate and erroneous regions and conquer them individually, based on our observation that a probability map can separate these regions. We propose a self-supervision loss, the 3D Point Consistency Loss, which enhances 3D reconstruction performance by forcing the 3D points back-projected from corresponding pixels using the predicted depth values to meet at the same 3D coordinates. Finally, we propagate these improved depth predictions toward edges and occlusions with the Coarse-to-fine Reliable Depth Propagation module. We generate sparse ground truth for the DTU dataset for evaluation, and extensive experiments verify that SGT-MVSNet outperforms state-of-the-art MVS methods in the sparse ground truth setting. Moreover, our method shows reconstruction results comparable to supervised MVS methods even though we use only tens to hundreds of ground truth 3D points.
Pages: 6158-6166
Citations: 3
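The 3D Point Consistency idea above can be sketched as follows: back-project corresponding pixels from several views using the predicted depths, and penalize the spread of the resulting 3D points. This is a simplified pinhole-camera sketch under assumed conventions (x_cam = R x_world + t), not the paper's implementation.

```python
import numpy as np

def backproject(K, R, t, uv, depth):
    """Pixel (u, v) with predicted depth -> world-frame 3D point.

    Camera model assumed here: x_cam = R @ x_world + t, uv ~ K @ x_cam.
    """
    u, v = uv
    x_cam = depth * np.linalg.inv(K) @ np.array([u, v, 1.0])
    return np.linalg.inv(R) @ (x_cam - t)

def point_consistency_loss(views):
    """views: list of (K, R, t, uv, predicted_depth) for one sparse GT point.

    If the depths are correct, the back-projected points coincide; the loss
    is the mean distance of the points to their centroid.
    """
    pts = np.stack([backproject(K, R, t, uv, d) for K, R, t, uv, d in views])
    centroid = pts.mean(axis=0)
    return float(np.linalg.norm(pts - centroid, axis=1).mean())
```

A consistent set of predicted depths drives the loss to zero; any view whose depth is off pulls its back-projected point away from the others.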
Robust Small-scale Pedestrian Detection with Cued Recall via Memory Learning
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00304
Jung Uk Kim, Sungjune Park, Yong Man Ro
Abstract: Although the visual appearance of small-scale objects is not well observed, humans can recognize them by associating the visual cues of small objects with their memorized appearance, a process called cued recall. In this paper, motivated by the human memory process, we introduce a novel pedestrian detection framework that imitates cued recall when detecting small-scale pedestrians. We propose large-scale embedding learning with a large-scale pedestrian recalling memory (LPR Memory). The purpose of the proposed large-scale embedding learning is to memorize and recall large-scale pedestrian appearance via the LPR Memory. To this end, we employ a large-scale pedestrian exemplar set so that the LPR Memory can recall the information of large-scale pedestrians from small-scale ones. Comprehensive quantitative and qualitative experimental results validate the effectiveness of the proposed framework with the LPR Memory.
Pages: 3030-3039
Citations: 29
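A memory that "recalls" a richer appearance from a weak cue is commonly realized as attention over key/value slots: the small-scale query is matched against memory keys and reads back a weighted sum of stored large-scale features. The sketch below is a generic attention read, hedged as an illustration rather than the paper's exact LPR Memory.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def recall_from_memory(query, keys, values, temperature=0.1):
    """Attention read: cosine similarity over keys -> weighted sum of values.

    query  : (d,) feature of a small-scale pedestrian (the cue)
    keys   : (n, d) memory keys
    values : (n, d) stored large-scale appearance features
    """
    sims = keys @ query / (np.linalg.norm(keys, axis=1)
                           * np.linalg.norm(query) + 1e-8)
    w = softmax(sims / temperature)  # low temperature -> near-nearest-neighbor
    return w @ values
```

With a low temperature the read collapses onto the best-matching slot, so a blurry small-scale cue retrieves an almost pure large-scale exemplar feature.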
Learning Causal Representation for Training Cross-Domain Pose Estimator via Generative Interventions
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.01108
Xiheng Zhang, Yongkang Wong, Xiaofei Wu, Juwei Lu, Mohan S. Kankanhalli, Xiangdong Li, Wei-dong Geng
Abstract: 3D pose estimation has attracted increasing attention with the availability of high-quality benchmark datasets. However, prior work shows that deep learning models tend to learn spurious correlations, which fail to generalize beyond the specific dataset they are trained on. In this work, we take a step toward training robust models for the cross-domain pose estimation task, bringing together ideas from causal representation learning and generative adversarial networks. Specifically, this paper introduces a novel framework for causal representation learning that explicitly exploits the causal structure of the task. We treat changing domains as interventions on images under the data-generation process and steer the generative model to produce counterfactual features. This helps the model learn transferable and causal relations across different domains. Our framework is able to learn with various types of unlabeled datasets. We demonstrate the efficacy of the proposed method on both human and hand pose estimation tasks. The experimental results show that the proposed approach achieves state-of-the-art performance on most datasets in both the domain adaptation and domain generalization settings.
Pages: 11250-11260
Citations: 19
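One way to read "domain change as intervention" is as an invariance objective: the pose predicted from a feature should not change when a generator intervenes on the feature's domain-specific factors. The toy sketch below expresses only that invariance idea; it is an assumption-laden illustration, not the paper's training objective.

```python
import numpy as np

def intervention_consistency_loss(pose_fn, feat, counterfactual_feats):
    """Mean squared difference between the pose predicted from the original
    feature and from each generated counterfactual version of it.

    pose_fn              : feature -> pose vector (any callable)
    feat                 : original feature
    counterfactual_feats : features after generative interventions
    """
    base = pose_fn(feat)
    diffs = [np.mean((pose_fn(cf) - base) ** 2) for cf in counterfactual_feats]
    return float(np.mean(diffs))
```

If the predictor only uses the causal (pose-relevant) part of the feature, interventions on the remaining factors leave the loss at zero.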
Synthesized Feature based Few-Shot Class-Incremental Learning on a Mixture of Subspaces
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00854
A. Cheraghian, Shafin Rahman, Sameera Ramasinghe, Pengfei Fang, Christian Simon, L. Petersson, Mehrtash Harandi
Abstract: Few-shot class-incremental learning (FSCIL) aims to incrementally add sets of novel classes to a well-trained base model in multiple training sessions, with the restriction that only a few novel instances are available per class. While learning novel classes, FSCIL methods gradually forget base (old) class training and overfit to the few novel class samples. Existing approaches have addressed this problem by computing class prototypes from the visual or semantic word vector domain. In this paper, we propose addressing this problem using a mixture of subspaces. Subspaces define the cluster structure of the visual domain and help describe the visual and semantic domains with respect to the overall distribution of the data. Additionally, we propose employing a variational autoencoder (VAE) to generate synthesized visual samples for augmenting pseudo-features while learning novel classes incrementally. The combined effect of the mixture of subspaces and the synthesized features reduces the forgetting and overfitting problems of FSCIL. Extensive experiments on three image classification datasets show that our proposed method achieves competitive results compared to state-of-the-art methods.
Pages: 8641-8650
Citations: 42
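The core of subspace-based class modeling can be shown with plain linear algebra: each class keeps an orthonormal basis, and a feature is scored by how well that basis reconstructs it. This is a generic sketch of the idea, assuming orthonormal bases; the paper's mixture and VAE components are not reproduced here.

```python
import numpy as np

def subspace_residual(x, basis):
    """Distance from x to the subspace spanned by the orthonormal rows
    of `basis` (reconstruction error after projection)."""
    proj = basis.T @ (basis @ x)
    return float(np.linalg.norm(x - proj))

def classify_by_subspace(x, class_bases):
    """Assign x to the class whose subspace reconstructs it best."""
    residuals = [subspace_residual(x, B) for B in class_bases]
    return int(np.argmin(residuals))
```

Because each class is a subspace rather than a single prototype point, the cluster structure of a class's features is preserved, which is what the mixture-of-subspaces formulation exploits.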
Deep Halftoning with Reversible Binary Pattern
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.01374
Menghan Xia, Wenbo Hu, Xueting Liu, T. Wong
Abstract: Existing halftoning algorithms usually drop colors and fine details when dithering color images with binary dot patterns, which makes it extremely difficult to recover the original information. To spare this recovery trouble in the future, we propose a novel halftoning technique that converts a color image into a binary halftone that is fully restorable to the original version. The key idea is to implicitly embed the previously dropped information into the halftone patterns, so that the halftone pattern not only reproduces the image tone and maintains blue-noise randomness, but also represents the color information and fine details. To this end, we exploit two collaborative convolutional neural networks (CNNs) to learn the dithering scheme under a nontrivial self-supervision formulation. To tackle the flatness degradation issue of CNNs, we propose a novel noise incentive block (NIB) that can serve as a generic CNN plug-in for performance promotion. Finally, we tailor a guiding-aware training scheme that keeps the convergence direction as regulated. We evaluate the invertible halftones in multiple aspects, which evidences the effectiveness of our method.
Pages: 13980-13989
Citations: 12
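For contrast with the learned, invertible halftone above, classical error-diffusion dithering (Floyd-Steinberg, a standard algorithm, not the paper's method) preserves average tone but irreversibly discards exactly the color and detail the paper's CNNs embed:

```python
import numpy as np

def floyd_steinberg(gray):
    """Classical error-diffusion halftoning: grayscale in [0,1] -> {0,1}.

    Each pixel is thresholded and its quantization error is pushed to
    unvisited neighbors with the standard 7/16, 3/16, 5/16, 1/16 weights.
    Unlike the paper's learned halftone, this mapping is NOT invertible.
    """
    img = gray.astype(float).copy()
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            out[y, x] = 1.0 if img[y, x] >= 0.5 else 0.0
            err = img[y, x] - out[y, x]
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                img[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
    return out
```

Error diffusion keeps the mean tone of a region roughly constant, which is the "reproduce the image tone" property; everything else about the original image is lost in the binary pattern.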
Separable Flow: Learning Motion Cost Volumes for Optical Flow Estimation
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.01063
Feihu Zhang, Oliver J. Woodford, V. Prisacariu, Philip H. S. Torr
Abstract: Full-motion cost volumes play a central role in current state-of-the-art optical flow methods. However, constructed using simple feature correlations, they lack the ability to encapsulate prior, or even non-local, knowledge. This creates artifacts in poorly constrained, ambiguous regions, such as occluded and textureless areas. We propose a separable cost volume module, a drop-in replacement for correlation cost volumes, that uses non-local aggregation layers to exploit global context cues and prior knowledge in order to disambiguate motion in these regions. Our method leads both the now-standard Sintel and KITTI optical flow benchmarks in terms of accuracy, and is also shown to generalize better from synthetic to real data.
Pages: 10787-10797
Citations: 63
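The "simple feature correlations" that the separable module replaces are built by dotting features of frame 1 with displaced features of frame 2 over a search window. A minimal sketch of such a correlation cost volume (generic construction, not the paper's module):

```python
import numpy as np

def correlation_cost_volume(f1, f2, max_disp):
    """All-displacements-in-window correlation volume.

    f1, f2  : (C, H, W) feature maps of two consecutive frames
    returns : (2*max_disp+1, 2*max_disp+1, H, W) matching costs, where
              index (dy, dx) corresponds to displacement (dy-max_disp,
              dx-max_disp).
    """
    C, H, W = f1.shape
    d = max_disp
    vol = np.zeros((2 * d + 1, 2 * d + 1, H, W))
    f2p = np.pad(f2, ((0, 0), (d, d), (d, d)))  # zero-pad spatial dims
    for dy in range(2 * d + 1):
        for dx in range(2 * d + 1):
            shifted = f2p[:, dy:dy + H, dx:dx + W]
            # channel-wise dot product, scaled for numerical stability
            vol[dy, dx] = (f1 * shifted).sum(axis=0) / np.sqrt(C)
    return vol
```

In occluded or textureless areas every displacement scores about the same, which is exactly the ambiguity the paper's non-local aggregation layers are designed to resolve.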
Task Switching Network for Multi-task Learning
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00818
Guolei Sun, Thomas Probst, D. Paudel, Nikola Popovic, Menelaos Kanakis, Jagruti R. Patel, Dengxin Dai, L. Gool
Abstract: We introduce Task Switching Networks (TSNs), a task-conditioned architecture with a single unified encoder/decoder for efficient multi-task learning. Multiple tasks are performed by switching between them, one task at a time. TSNs have a constant number of parameters irrespective of the number of tasks. This scalable yet conceptually simple approach circumvents the overhead and intricacy of task-specific network components in existing works. In fact, we demonstrate for the first time that multi-tasking can be performed with a single task-conditioned decoder. We achieve this by learning task-specific conditioning parameters through a jointly trained task embedding network, encouraging constructive interaction between tasks. Experiments validate the effectiveness of our approach, achieving state-of-the-art results on two challenging multi-task benchmarks, PASCAL-Context and NYUD. Our analysis of the learned task embeddings further indicates a connection to task relationships studied in the recent literature.
Pages: 8271-8280
Citations: 30
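The task-conditioning idea can be made concrete with FiLM-style modulation (one common way to realize conditioning, assumed here for illustration): a task embedding produces per-channel scale and shift parameters that modulate a single shared decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

class TaskSwitchingHead:
    """One shared linear decoder, modulated per task by scale/shift
    parameters produced from a task embedding. A minimal stand-in for
    TSN-style conditioning; names and shapes are illustrative."""

    def __init__(self, dim, n_tasks, emb_dim=8):
        self.task_emb = rng.normal(size=(n_tasks, emb_dim))
        self.to_film = rng.normal(size=(emb_dim, 2 * dim))  # -> (scale, shift)
        self.decoder = rng.normal(size=(dim, dim))          # shared by all tasks

    def forward(self, x, task_id):
        film = self.task_emb[task_id] @ self.to_film
        scale, shift = np.split(film, 2)
        h = x * (1.0 + scale) + shift   # task-conditioned modulation
        return h @ self.decoder         # same decoder weights for every task
```

Adding a task only adds one embedding row, which is why the parameter count stays essentially constant in the number of tasks.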
Improving Low-Precision Network Quantization via Bin Regularization
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00521
Tiantian Han, Dong Li, Ji Liu, Lu Tian, Yi Shan
Abstract: Model quantization is an important mechanism for energy-efficient deployment of deep neural networks on resource-constrained devices, reducing the bit precision of weights and activations. However, it remains challenging to maintain high accuracy as bit precision decreases, especially for low-precision networks (e.g., 2-bit MobileNetV2). Existing methods address this problem by minimizing the quantization error or mimicking the data distribution of full-precision networks. In this work, we propose a novel weight regularization algorithm for improving low-precision network quantization. Instead of constraining the overall data distribution, we separately optimize all elements in each quantization bin to be as close as possible to the target quantized value. This bin regularization (BR) mechanism encourages the weight distribution of each quantization bin to be sharp and, ideally, to approximate a Dirac delta distribution. Experiments demonstrate that our method achieves consistent improvements over state-of-the-art quantization-aware training methods for different low-precision networks. In particular, bin regularization improves LSQ for 2-bit MobileNetV2 and MobileNetV3-Small by 3.9% and 4.9% top-1 accuracy on ImageNet, respectively.
Pages: 5241-5250
Citations: 26
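The bin regularization objective described above reduces to pulling every weight toward the quantized value of the bin it falls into. A minimal sketch (uniform nearest-level binning assumed; the paper's exact binning and weighting may differ):

```python
import numpy as np

def quantize(w, levels):
    """Map each weight to its nearest quantization level (its bin target)."""
    idx = np.abs(w[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]

def bin_regularization(w, levels):
    """Mean squared distance of weights to their bin targets. Minimized
    when each bin's weights collapse onto the quantized value, i.e. a
    Dirac-delta-like distribution per bin."""
    return float(np.mean((w - quantize(w, levels)) ** 2))
```

Added to the task loss during quantization-aware training, this term sharpens each bin's weight distribution so that rounding at deployment time loses almost nothing.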
Detecting Persuasive Atypicality by Modeling Contextual Compatibility
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00101
M. Guo, R. Hwa, Adriana Kovashka
Abstract: We propose a new approach to detecting atypicality in persuasive imagery. Unlike the atypicality studied in prior work, persuasive atypicality has the particular purpose of conveying meaning, and relies on understanding the common-sense spatial relations of objects. We propose a self-supervised, attention-based technique that captures contextual compatibility and models spatial relations precisely. We further experiment with capturing common sense through the semantics of co-occurring object classes. We verify our approach on a dataset of atypicality in visual advertisements, as well as a second dataset capturing atypicality with no persuasive intent.
Pages: 952-962
Citations: 6