2021 IEEE/CVF International Conference on Computer Vision (ICCV): Latest Publications
Weakly Supervised Segmentation of Small Buildings with Point Labels
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00731
Jae-Hun Lee, ChanYoung Kim, S. Sull
Abstract: Most supervised image segmentation methods require delicate and time-consuming pixel-level labeling of buildings or objects, especially for small objects. In this paper, we present a weakly supervised segmentation network for aerial/satellite images that considers small and large objects separately. First, we propose a simple point labeling method for small objects, while large objects are fully labeled. Then, we present a segmentation network trained with a small-object mask that separates small and large objects in the loss function. During training, we employ a memory bank to cope with the limited number of point labels. Experimental results on three public datasets demonstrate the feasibility of our approach.
Pages: 7386-7395
Citations: 8
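The mixed supervision described in the abstract, dense masks for large objects and sparse point labels for small ones, can be illustrated with a partial cross-entropy that only scores labeled pixels. This is a minimal sketch, not the paper's actual loss; all function names are illustrative.

```python
import numpy as np

def partial_bce(probs, labels, supervised):
    """Binary cross-entropy averaged only over supervised pixels.

    probs      : (H, W) predicted foreground probabilities
    labels     : (H, W) 0/1 ground truth (arbitrary where unsupervised)
    supervised : (H, W) boolean mask, True where a label exists
    """
    eps = 1e-7
    p = np.clip(probs, eps, 1 - eps)
    ce = -(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    return float(ce[supervised].mean())

def weak_seg_loss(probs, labels, large_mask, point_mask):
    """Large objects: every pixel supervised; small objects: points only."""
    # Large-object regions (with dense masks) and annotated point
    # locations are the only pixels that contribute to the loss.
    return partial_bce(probs, labels, large_mask | point_mask)
```

Unlabeled pixels contribute nothing, which is what makes a handful of point clicks per small building a usable training signal.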
Just a Few Points are All You Need for Multi-view Stereo: A Novel Semi-supervised Learning Method for Multi-view Stereo
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00612
Taekyung Kim, Jaehoon Choi, Seokeon Choi, Dongki Jung, Changick Kim
Abstract: While learning-based multi-view stereo (MVS) methods have recently shown strong performance in quality and efficiency, limited MVS data hampers generalization to unseen environments. A simple solution is to generate various large-scale MVS datasets, but generating dense ground truth for 3D structure requires a huge amount of time and resources. On the other hand, if the reliance on dense ground truth is relaxed, MVS systems will generalize more smoothly to new environments. To this end, we introduce a novel semi-supervised multi-view stereo framework, the Sparse Ground Truth-based MVS Network (SGT-MVSNet), that can reliably reconstruct 3D structures even with only a few ground truth 3D points. Our strategy is to divide the accurate and erroneous regions and conquer them individually, based on our observation that a probability map can separate these regions. We propose a self-supervision loss, the 3D Point Consistency Loss, which enhances 3D reconstruction performance by forcing the 3D points back-projected from corresponding pixels using the predicted depth values to meet at the same 3D coordinates. Finally, we propagate these improved depth predictions toward edges and occlusions with the Coarse-to-fine Reliable Depth Propagation module. We generate sparse ground truth for the DTU dataset for evaluation, and extensive experiments verify that SGT-MVSNet outperforms state-of-the-art MVS methods in the sparse ground truth setting. Moreover, our method shows reconstruction results comparable to supervised MVS methods even though we use only tens to hundreds of ground truth 3D points.
Pages: 6158-6166
Citations: 3
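The 3D Point Consistency idea above can be sketched as follows: back-project corresponding pixels from several views using the predicted depths, and penalize the spread of the resulting 3D points. This is a simplified pinhole-camera sketch under assumed conventions (x_cam = R x_world + t), not the paper's implementation.

```python
import numpy as np

def backproject(K, R, t, uv, depth):
    """Pixel (u, v) with predicted depth -> world-frame 3D point.

    Camera model assumed here: x_cam = R @ x_world + t, uv ~ K @ x_cam.
    """
    u, v = uv
    x_cam = depth * np.linalg.inv(K) @ np.array([u, v, 1.0])
    return np.linalg.inv(R) @ (x_cam - t)

def point_consistency_loss(views):
    """views: list of (K, R, t, uv, predicted_depth) for one sparse GT point.

    If the depths are correct, the back-projected points coincide; the loss
    is the mean distance of the points to their centroid.
    """
    pts = np.stack([backproject(K, R, t, uv, d) for K, R, t, uv, d in views])
    centroid = pts.mean(axis=0)
    return float(np.linalg.norm(pts - centroid, axis=1).mean())
```

A consistent set of predicted depths drives the loss to zero; any view whose depth is off pulls its back-projected point away from the others.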
Robust Small-scale Pedestrian Detection with Cued Recall via Memory Learning
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00304
Jung Uk Kim, Sungjune Park, Yong Man Ro
Abstract: Although the visual appearance of small-scale objects is not well observed, humans can recognize them by associating the visual cues of small objects with their memorized appearance, a process called cued recall. In this paper, motivated by the human memory process, we introduce a novel pedestrian detection framework that imitates cued recall when detecting small-scale pedestrians. We propose large-scale embedding learning with a large-scale pedestrian recalling memory (LPR Memory). The purpose of the proposed large-scale embedding learning is to memorize and recall large-scale pedestrian appearance via the LPR Memory. To this end, we employ a large-scale pedestrian exemplar set so that the LPR Memory can recall the information of large-scale pedestrians from small-scale ones. Comprehensive quantitative and qualitative experimental results validate the effectiveness of the proposed framework with the LPR Memory.
Pages: 3030-3039
Citations: 29
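A memory that "recalls" a richer appearance from a weak cue is commonly realized as attention over key/value slots: the small-scale query is matched against memory keys and reads back a weighted sum of stored large-scale features. The sketch below is a generic attention read, hedged as an illustration rather than the paper's exact LPR Memory.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def recall_from_memory(query, keys, values, temperature=0.1):
    """Attention read: cosine similarity over keys -> weighted sum of values.

    query  : (d,) feature of a small-scale pedestrian (the cue)
    keys   : (n, d) memory keys
    values : (n, d) stored large-scale appearance features
    """
    sims = keys @ query / (np.linalg.norm(keys, axis=1)
                           * np.linalg.norm(query) + 1e-8)
    w = softmax(sims / temperature)  # low temperature -> near-nearest-neighbor
    return w @ values
```

With a low temperature the read collapses onto the best-matching slot, so a blurry small-scale cue retrieves an almost pure large-scale exemplar feature.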
Learning Causal Representation for Training Cross-Domain Pose Estimator via Generative Interventions
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.01108
Xiheng Zhang, Yongkang Wong, Xiaofei Wu, Juwei Lu, Mohan S. Kankanhalli, Xiangdong Li, Wei-dong Geng
Abstract: 3D pose estimation has attracted increasing attention with the availability of high-quality benchmark datasets. However, prior work shows that deep learning models tend to learn spurious correlations, which fail to generalize beyond the specific dataset they are trained on. In this work, we take a step toward training robust models for the cross-domain pose estimation task, bringing together ideas from causal representation learning and generative adversarial networks. Specifically, this paper introduces a novel framework for causal representation learning that explicitly exploits the causal structure of the task. We treat changing domains as interventions on images under the data-generation process and steer the generative model to produce counterfactual features. This helps the model learn transferable and causal relations across different domains. Our framework is able to learn with various types of unlabeled datasets. We demonstrate the efficacy of the proposed method on both human and hand pose estimation tasks. The experimental results show that the proposed approach achieves state-of-the-art performance on most datasets in both the domain adaptation and domain generalization settings.
Pages: 11250-11260
Citations: 19
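One way to read "domain change as intervention" is as an invariance objective: the pose predicted from a feature should not change when a generator intervenes on the feature's domain-specific factors. The toy sketch below expresses only that invariance idea; it is an assumption-laden illustration, not the paper's training objective.

```python
import numpy as np

def intervention_consistency_loss(pose_fn, feat, counterfactual_feats):
    """Mean squared difference between the pose predicted from the original
    feature and from each generated counterfactual version of it.

    pose_fn              : feature -> pose vector (any callable)
    feat                 : original feature
    counterfactual_feats : features after generative interventions
    """
    base = pose_fn(feat)
    diffs = [np.mean((pose_fn(cf) - base) ** 2) for cf in counterfactual_feats]
    return float(np.mean(diffs))
```

If the predictor only uses the causal (pose-relevant) part of the feature, interventions on the remaining factors leave the loss at zero.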
Synthesized Feature based Few-Shot Class-Incremental Learning on a Mixture of Subspaces
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00854
A. Cheraghian, Shafin Rahman, Sameera Ramasinghe, Pengfei Fang, Christian Simon, L. Petersson, Mehrtash Harandi
Abstract: Few-shot class-incremental learning (FSCIL) aims to incrementally add sets of novel classes to a well-trained base model in multiple training sessions, with the restriction that only a few novel instances are available per class. While learning novel classes, FSCIL methods gradually forget base (old) class training and overfit to the few novel class samples. Existing approaches have addressed this problem by computing class prototypes from the visual or semantic word vector domain. In this paper, we propose addressing this problem using a mixture of subspaces. Subspaces define the cluster structure of the visual domain and help describe the visual and semantic domains with respect to the overall distribution of the data. Additionally, we propose employing a variational autoencoder (VAE) to generate synthesized visual samples for augmenting pseudo-features while learning novel classes incrementally. The combined effect of the mixture of subspaces and the synthesized features reduces the forgetting and overfitting problems of FSCIL. Extensive experiments on three image classification datasets show that our proposed method achieves competitive results compared to state-of-the-art methods.
Pages: 8641-8650
Citations: 42
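The core of subspace-based class modeling can be shown with plain linear algebra: each class keeps an orthonormal basis, and a feature is scored by how well that basis reconstructs it. This is a generic sketch of the idea, assuming orthonormal bases; the paper's mixture and VAE components are not reproduced here.

```python
import numpy as np

def subspace_residual(x, basis):
    """Distance from x to the subspace spanned by the orthonormal rows
    of `basis` (reconstruction error after projection)."""
    proj = basis.T @ (basis @ x)
    return float(np.linalg.norm(x - proj))

def classify_by_subspace(x, class_bases):
    """Assign x to the class whose subspace reconstructs it best."""
    residuals = [subspace_residual(x, B) for B in class_bases]
    return int(np.argmin(residuals))
```

Because each class is a subspace rather than a single prototype point, the cluster structure of a class's features is preserved, which is what the mixture-of-subspaces formulation exploits.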
Deep Halftoning with Reversible Binary Pattern
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.01374
Menghan Xia, Wenbo Hu, Xueting Liu, T. Wong
Abstract: Existing halftoning algorithms usually drop colors and fine details when dithering color images with binary dot patterns, which makes it extremely difficult to recover the original information. To spare this recovery trouble in the future, we propose a novel halftoning technique that converts a color image into a binary halftone that is fully restorable to the original version. The key idea is to implicitly embed the previously dropped information into the halftone patterns, so that the halftone pattern not only reproduces the image tone and maintains blue-noise randomness, but also represents the color information and fine details. To this end, we exploit two collaborative convolutional neural networks (CNNs) to learn the dithering scheme under a nontrivial self-supervision formulation. To tackle the flatness degradation issue of CNNs, we propose a novel noise incentive block (NIB) that can serve as a generic CNN plug-in for performance promotion. Finally, we tailor a guiding-aware training scheme that keeps the convergence direction as regulated. We evaluate the invertible halftones in multiple aspects, which evidences the effectiveness of our method.
Pages: 13980-13989
Citations: 12
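For contrast with the learned, invertible halftone above, classical error-diffusion dithering (Floyd-Steinberg, a standard algorithm, not the paper's method) preserves average tone but irreversibly discards exactly the color and detail the paper's CNNs embed:

```python
import numpy as np

def floyd_steinberg(gray):
    """Classical error-diffusion halftoning: grayscale in [0,1] -> {0,1}.

    Each pixel is thresholded and its quantization error is pushed to
    unvisited neighbors with the standard 7/16, 3/16, 5/16, 1/16 weights.
    Unlike the paper's learned halftone, this mapping is NOT invertible.
    """
    img = gray.astype(float).copy()
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            out[y, x] = 1.0 if img[y, x] >= 0.5 else 0.0
            err = img[y, x] - out[y, x]
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                img[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
    return out
```

Error diffusion keeps the mean tone of a region roughly constant, which is the "reproduce the image tone" property; everything else about the original image is lost in the binary pattern.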
Separable Flow: Learning Motion Cost Volumes for Optical Flow Estimation
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.01063
Feihu Zhang, Oliver J. Woodford, V. Prisacariu, Philip H. S. Torr
Abstract: Full-motion cost volumes play a central role in current state-of-the-art optical flow methods. However, constructed using simple feature correlations, they lack the ability to encapsulate prior, or even non-local, knowledge. This creates artifacts in poorly constrained, ambiguous regions, such as occluded and textureless areas. We propose a separable cost volume module, a drop-in replacement for correlation cost volumes, that uses non-local aggregation layers to exploit global context cues and prior knowledge in order to disambiguate motion in these regions. Our method leads both the now-standard Sintel and KITTI optical flow benchmarks in terms of accuracy, and is also shown to generalize better from synthetic to real data.
Pages: 10787-10797
Citations: 63
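The "simple feature correlations" that the separable module replaces are built by dotting features of frame 1 with displaced features of frame 2 over a search window. A minimal sketch of such a correlation cost volume (generic construction, not the paper's module):

```python
import numpy as np

def correlation_cost_volume(f1, f2, max_disp):
    """All-displacements-in-window correlation volume.

    f1, f2  : (C, H, W) feature maps of two consecutive frames
    returns : (2*max_disp+1, 2*max_disp+1, H, W) matching costs, where
              index (dy, dx) corresponds to displacement (dy-max_disp,
              dx-max_disp).
    """
    C, H, W = f1.shape
    d = max_disp
    vol = np.zeros((2 * d + 1, 2 * d + 1, H, W))
    f2p = np.pad(f2, ((0, 0), (d, d), (d, d)))  # zero-pad spatial dims
    for dy in range(2 * d + 1):
        for dx in range(2 * d + 1):
            shifted = f2p[:, dy:dy + H, dx:dx + W]
            # channel-wise dot product, scaled for numerical stability
            vol[dy, dx] = (f1 * shifted).sum(axis=0) / np.sqrt(C)
    return vol
```

In occluded or textureless areas every displacement scores about the same, which is exactly the ambiguity the paper's non-local aggregation layers are designed to resolve.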
Task Switching Network for Multi-task Learning
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00818
Guolei Sun, Thomas Probst, D. Paudel, Nikola Popovic, Menelaos Kanakis, Jagruti R. Patel, Dengxin Dai, L. Gool
Abstract: We introduce Task Switching Networks (TSNs), a task-conditioned architecture with a single unified encoder/decoder for efficient multi-task learning. Multiple tasks are performed by switching between them, one task at a time. TSNs have a constant number of parameters irrespective of the number of tasks. This scalable yet conceptually simple approach circumvents the overhead and intricacy of task-specific network components in existing works. In fact, we demonstrate for the first time that multi-tasking can be performed with a single task-conditioned decoder. We achieve this by learning task-specific conditioning parameters through a jointly trained task embedding network, encouraging constructive interaction between tasks. Experiments validate the effectiveness of our approach, achieving state-of-the-art results on two challenging multi-task benchmarks, PASCAL-Context and NYUD. Our analysis of the learned task embeddings further indicates a connection to task relationships studied in the recent literature.
Pages: 8271-8280
Citations: 30
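The task-conditioning idea can be made concrete with FiLM-style modulation (one common way to realize conditioning, assumed here for illustration): a task embedding produces per-channel scale and shift parameters that modulate a single shared decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

class TaskSwitchingHead:
    """One shared linear decoder, modulated per task by scale/shift
    parameters produced from a task embedding. A minimal stand-in for
    TSN-style conditioning; names and shapes are illustrative."""

    def __init__(self, dim, n_tasks, emb_dim=8):
        self.task_emb = rng.normal(size=(n_tasks, emb_dim))
        self.to_film = rng.normal(size=(emb_dim, 2 * dim))  # -> (scale, shift)
        self.decoder = rng.normal(size=(dim, dim))          # shared by all tasks

    def forward(self, x, task_id):
        film = self.task_emb[task_id] @ self.to_film
        scale, shift = np.split(film, 2)
        h = x * (1.0 + scale) + shift   # task-conditioned modulation
        return h @ self.decoder         # same decoder weights for every task
```

Adding a task only adds one embedding row, which is why the parameter count stays essentially constant in the number of tasks.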
Improving Low-Precision Network Quantization via Bin Regularization
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00521
Tiantian Han, Dong Li, Ji Liu, Lu Tian, Yi Shan
Abstract: Model quantization is an important mechanism for energy-efficient deployment of deep neural networks on resource-constrained devices, reducing the bit precision of weights and activations. However, it remains challenging to maintain high accuracy as bit precision decreases, especially for low-precision networks (e.g., 2-bit MobileNetV2). Existing methods address this problem by minimizing the quantization error or mimicking the data distribution of full-precision networks. In this work, we propose a novel weight regularization algorithm for improving low-precision network quantization. Instead of constraining the overall data distribution, we separately optimize all elements in each quantization bin to be as close as possible to the target quantized value. This bin regularization (BR) mechanism encourages the weight distribution of each quantization bin to be sharp and, ideally, to approximate a Dirac delta distribution. Experiments demonstrate that our method achieves consistent improvements over state-of-the-art quantization-aware training methods for different low-precision networks. In particular, bin regularization improves LSQ for 2-bit MobileNetV2 and MobileNetV3-Small by 3.9% and 4.9% top-1 accuracy on ImageNet, respectively.
Pages: 5241-5250
Citations: 26
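The bin regularization objective described above reduces to pulling every weight toward the quantized value of the bin it falls into. A minimal sketch (uniform nearest-level binning assumed; the paper's exact binning and weighting may differ):

```python
import numpy as np

def quantize(w, levels):
    """Map each weight to its nearest quantization level (its bin target)."""
    idx = np.abs(w[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]

def bin_regularization(w, levels):
    """Mean squared distance of weights to their bin targets. Minimized
    when each bin's weights collapse onto the quantized value, i.e. a
    Dirac-delta-like distribution per bin."""
    return float(np.mean((w - quantize(w, levels)) ** 2))
```

Added to the task loss during quantization-aware training, this term sharpens each bin's weight distribution so that rounding at deployment time loses almost nothing.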
Detecting Persuasive Atypicality by Modeling Contextual Compatibility
2021 IEEE/CVF International Conference on Computer Vision (ICCV) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCV48922.2021.00101
M. Guo, R. Hwa, Adriana Kovashka
Abstract: We propose a new approach to detecting atypicality in persuasive imagery. Unlike the atypicality studied in prior work, persuasive atypicality has the particular purpose of conveying meaning, and relies on understanding the common-sense spatial relations of objects. We propose a self-supervised, attention-based technique that captures contextual compatibility and models spatial relations precisely. We further experiment with capturing common sense through the semantics of co-occurring object classes. We verify our approach on a dataset of atypicality in visual advertisements, as well as a second dataset capturing atypicality with no persuasive intent.
Pages: 952-962
Citations: 6