Latest Publications: 2017 IEEE International Conference on Computer Vision (ICCV)

Open Set Domain Adaptation
2017 IEEE International Conference on Computer Vision (ICCV) Pub Date: 2017-10-01 DOI: 10.1109/ICCV.2017.88 Pages: 754-763
Pau Panareda Busto, Juergen Gall
Abstract: When the training and the test data belong to different domains, the accuracy of an object classifier is significantly reduced. Therefore, several algorithms have been proposed in recent years to diminish the so-called domain shift between datasets. However, all available evaluation protocols for domain adaptation describe a closed set recognition task, where both domains, namely source and target, contain exactly the same object classes. In this work, we explore the field of domain adaptation in open sets, which is a more realistic scenario where only a few categories of interest are shared between source and target data. We therefore propose a method that fits both closed and open set scenarios. The approach learns a mapping from the source to the target domain by jointly solving an assignment problem that labels those target instances that potentially belong to the categories of interest present in the source dataset. A thorough evaluation shows that our approach outperforms the state of the art.
Citations: 457
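The entry above describes an alternating scheme: assign target samples to the shared categories (rejecting the rest as unknown) and re-estimate a source-to-target mapping. Below is a minimal, illustrative sketch of that loop; the function name `open_set_adapt`, the thresholded nearest-centre assignment, and the linear least-squares mapping are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def open_set_adapt(src_feats, src_labels, tgt_feats, n_iters=10, reject_quantile=0.8):
    """Alternate between (1) assigning target samples to source classes of
    interest (or rejecting them as 'unknown') and (2) re-estimating a linear
    map from source to target.  Integer class labels are assumed; the
    quantile-based rejection stands in for the paper's constrained
    assignment problem."""
    classes = np.unique(src_labels)
    d = src_feats.shape[1]
    W = np.eye(d)                                   # start with the identity map
    for _ in range(n_iters):
        mapped = src_feats @ W.T                    # map source into target space
        centres = np.stack([mapped[src_labels == c].mean(0) for c in classes])
        dists = np.linalg.norm(tgt_feats[:, None, :] - centres[None], axis=2)
        assign = classes[dists.argmin(1)]           # tentative class per target sample
        best = dists.min(1)
        known = best <= np.quantile(best, reject_quantile)  # reject far samples as unknown
        # re-fit W by least squares so mapped source centres match assigned targets
        X, Y = [], []
        for c in classes:
            t = tgt_feats[known & (assign == c)]
            if len(t):
                X.append(src_feats[src_labels == c].mean(0))
                Y.append(t.mean(0))
        M, *_ = np.linalg.lstsq(np.stack(X), np.stack(Y), rcond=None)
        W = M.T
    return W, np.where(known, assign, -1)           # -1 marks 'unknown' targets
```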
Ensemble Diffusion for Retrieval
2017 IEEE International Conference on Computer Vision (ICCV) Pub Date: 2017-10-01 DOI: 10.1109/ICCV.2017.90 Pages: 774-783
S. Bai, Zhichao Zhou, Jingdong Wang, X. Bai, Longin Jan Latecki, Q. Tian
Abstract: As a post-processing procedure, the diffusion process has demonstrated its ability to substantially improve the performance of various visual retrieval systems. Meanwhile, great effort has also been devoted to similarity (or metric) fusion, since a single type of similarity cannot fully reveal the intrinsic relationship between objects. This has stimulated great research interest in considering similarity fusion within the framework of the diffusion process (i.e., fusion with diffusion) for robust retrieval. In this paper, we first revisit representative methods for fusion with diffusion and provide new insights that were overlooked by previous work. Then, observing that existing algorithms are susceptible to noisy similarities, the proposed Regularized Ensemble Diffusion (RED) is bundled with an automatic weight learning paradigm so that the negative impact of noisy similarities is suppressed. Finally, we integrate several recently proposed similarities with the proposed framework. The experimental results suggest that we achieve new state-of-the-art performance on various retrieval tasks, including 3D shape retrieval on the ModelNet dataset and image retrieval on the Holidays and UKBench datasets.
Citations: 93
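As a rough companion to the abstract above, the sketch below fuses several affinity matrices and runs a standard closed-form diffusion, f* = (1 - alpha)(I - alpha S)^(-1), on the fused graph. The uniform weights stand in for RED's automatic weight learning and the regularization term is omitted, so this is an illustrative baseline rather than the proposed method.

```python
import numpy as np

def normalize_affinity(W):
    """Symmetric normalisation S = D^{-1/2} W D^{-1/2} of an affinity matrix."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(1), 1e-12))
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def ensemble_diffusion(affinities, weights=None, alpha=0.9):
    """Fuse several affinity matrices with a weighted sum, then apply the
    closed-form diffusion.  Uniform weights are an assumption standing in
    for the learned weights of the RED framework."""
    n = affinities[0].shape[0]
    if weights is None:
        weights = np.full(len(affinities), 1.0 / len(affinities))
    S = sum(w * normalize_affinity(A) for w, A in zip(weights, affinities))
    F = (1 - alpha) * np.linalg.inv(np.eye(n) - alpha * S)
    return F   # F[i, j]: diffused similarity used to rank item j for query i
```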
Flip-Invariant Motion Representation
2017 IEEE International Conference on Computer Vision (ICCV) Pub Date: 2017-10-01 DOI: 10.1109/ICCV.2017.600 Pages: 5629-5638
Takumi Kobayashi
Abstract: In action recognition, local motion descriptors contribute to effectively representing video sequences in which target actions appear in localized spatio-temporal regions. For robust recognition, these fundamental descriptors are required to be invariant against horizontal (mirror) flipping of video frames, which frequently occurs due to changes in camera viewpoint and action direction and deteriorates classification performance. In this paper, we propose two approaches to render local motion descriptors flip-invariant. One leverages local motion flows to ensure invariance on the input patches where the descriptors are computed. The other theoretically derives an invariant form from the flipping transformation applied to hand-crafted descriptors; this method is further extended to ConvNet descriptors by learning the invariant form from data. Experimental results on human action classification show that the proposed methods favorably improve the performance of both hand-crafted and ConvNet descriptors.
Citations: 7
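One simple way to see what flip invariance of a motion descriptor means is to symmetrise a descriptor over a patch and its horizontal mirror, as in the sketch below. Averaging is an assumed symmetrisation for illustration only; it is not the invariant form derived in the paper.

```python
import numpy as np

def flip_invariant_descriptor(patch_flow, describe):
    """Compute the descriptor on an (H, W, 2) optical-flow patch and on its
    horizontal mirror (with the x-flow negated), then average the two.
    `describe` is any descriptor function mapping a flow patch to a 1-D
    vector; by construction the result is identical for a patch and its
    mirror."""
    flipped = patch_flow[:, ::-1].copy()
    flipped[..., 0] = -flipped[..., 0]      # mirroring reverses horizontal motion
    d1, d2 = describe(patch_flow), describe(flipped)
    return 0.5 * (d1 + d2)
```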
A Revisit of Sparse Coding Based Anomaly Detection in Stacked RNN Framework
2017 IEEE International Conference on Computer Vision (ICCV) Pub Date: 2017-10-01 DOI: 10.1109/ICCV.2017.45 Pages: 341-349
Weixin Luo, Wen Liu, Shenghua Gao
Abstract: Motivated by the capability of sparse-coding-based anomaly detection, we propose Temporally-coherent Sparse Coding (TSC), in which we enforce that similar neighbouring frames are encoded with similar reconstruction coefficients. We then map TSC to a special type of stacked Recurrent Neural Network (sRNN). By taking advantage of the sRNN's ability to learn all parameters simultaneously, the nontrivial hyper-parameter selection of TSC can be avoided; moreover, with a shallow sRNN, the reconstruction coefficients can be inferred within a single forward pass, which reduces the computational cost of learning sparse coefficients. The contributions of this paper are two-fold: i) we propose TSC, which can be mapped to an sRNN that facilitates parameter optimization and accelerates anomaly prediction; ii) we build a very large dataset which is larger than the union of all existing anomaly detection datasets in terms of both the volume of data and the diversity of scenes. Extensive experiments on both a toy dataset and real datasets demonstrate that our TSC-based and sRNN-based methods consistently outperform existing methods, which validates the effectiveness of our approach.
Citations: 484
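The TSC objective suggested by the abstract is a sparse-coding problem with an extra temporal-coherence term, and the shallow sRNN corresponds to a few unrolled iterations of an ISTA-style solver. Below is a minimal illustration under that reading; the dictionary D, step size, and iteration count are assumptions of this sketch.

```python
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def tsc_encode(x_t, a_prev, D, lam1=0.1, lam2=0.1, n_steps=3, step=None):
    """A few ISTA-style iterations for the sketched TSC objective
        min_a 0.5*||x_t - D a||^2 + lam1*||a||_1 + 0.5*lam2*||a - a_prev||^2,
    which is what a shallow unrolled recurrent network would compute."""
    if step is None:
        step = 1.0 / (np.linalg.norm(D, 2) ** 2 + lam2)   # 1 / Lipschitz constant
    a = a_prev.copy()
    for _ in range(n_steps):
        grad = D.T @ (D @ a - x_t) + lam2 * (a - a_prev)  # gradient of the smooth part
        a = soft_threshold(a - step * grad, step * lam1)  # proximal step for the l1 term
    return a   # the reconstruction error ||x_t - D a|| then serves as the anomaly score
```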
Editable Parametric Dense Foliage from 3D Capture
2017 IEEE International Conference on Computer Vision (ICCV) Pub Date: 2017-10-01 DOI: 10.1109/ICCV.2017.567 Pages: 5315-5324
P. Beardsley, G. Chaurasia
Abstract: We present an algorithm to compute parametric models of dense foliage. The guiding principles of our work are automatic reconstruction and a compact, artist-friendly representation. We use Bezier patches to model leaf surfaces, which we compute from images and point clouds of dense foliage. We present an algorithm to segment individual leaves from colour and depth data. We then reconstruct the Bezier representation from segmented leaf point clouds using non-linear optimisation. Unlike previous work, we do not require laboratory-scanned exemplars or user intervention. We also demonstrate intuitive manipulators to edit the reconstructed parametric models. We believe our work is a step towards making captured data more accessible to artists for foliage modelling.
Citations: 9
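To make the Bezier-patch representation concrete, the sketch below fits a bicubic Bezier patch to a segmented leaf point cloud by linear least squares, assuming a (u, v) parameterisation of the points is already available (e.g. from projecting onto a fitted plane). The paper itself uses non-linear optimisation, so this is only an illustrative simplification.

```python
import numpy as np

def bernstein3(t):
    """Cubic Bernstein basis evaluated at parameter values t in [0, 1]."""
    t = np.asarray(t)[:, None]
    return np.hstack([(1 - t) ** 3, 3 * t * (1 - t) ** 2, 3 * t ** 2 * (1 - t), t ** 3])

def fit_bezier_patch(points, uv):
    """Least-squares fit of 4x4 control points so the tensor-product Bezier
    surface passes near the (N, 3) leaf points with given (N, 2) parameters."""
    Bu, Bv = bernstein3(uv[:, 0]), bernstein3(uv[:, 1])           # (N, 4) each
    A = np.einsum('ni,nj->nij', Bu, Bv).reshape(len(points), 16)  # tensor-product basis
    ctrl, *_ = np.linalg.lstsq(A, points, rcond=None)             # (16, 3)
    return ctrl.reshape(4, 4, 3)

def eval_bezier_patch(ctrl, uv):
    """Evaluate the fitted patch at (u, v) samples; useful for editing or re-meshing."""
    Bu, Bv = bernstein3(uv[:, 0]), bernstein3(uv[:, 1])
    return np.einsum('ni,nj,ijk->nk', Bu, Bv, ctrl)
```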
Deep Free-Form Deformation Network for Object-Mask Registration
2017 IEEE International Conference on Computer Vision (ICCV) Pub Date: 2017-10-01 DOI: 10.1109/ICCV.2017.456 Pages: 4261-4269
Haoyang Zhang, Xuming He
Abstract: This paper addresses the problem of object-mask registration, which aligns a shape mask to a target object instance. Prior work typically formulates the problem as an object segmentation task with a mask prior, which is challenging to solve. In this work, we take a transformation-based approach that predicts a 2D non-rigid spatial transform and warps the shape mask onto the target object. In particular, we propose a deep spatial transformer network that learns free-form deformations (FFDs) to non-rigidly warp the shape mask based on a multi-level dual mask feature pooling strategy. The FFD transforms are based on B-splines and parameterized by the offsets of predefined control points, which are differentiable. Therefore, we are able to train the entire network in an end-to-end manner with an L2 matching loss. We evaluate our FFD network on a challenging object-mask alignment task, which aims to refine a set of object segment proposals, and our approach achieves state-of-the-art performance on the Cityscapes, PASCAL VOC and MS COCO datasets.
Citations: 10
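The core of a B-spline free-form deformation is a dense displacement field interpolated from a coarse grid of control-point offsets. The numpy sketch below shows only that interpolation; in the paper the offsets are predicted by a network and the warp is applied with a differentiable sampler, neither of which is reproduced here.

```python
import numpy as np

def bspline_basis(t):
    """Uniform cubic B-spline blending functions B_0..B_3 at t in [0, 1]."""
    return np.stack([(1 - t) ** 3 / 6,
                     (3 * t ** 3 - 6 * t ** 2 + 4) / 6,
                     (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) / 6,
                     t ** 3 / 6], axis=-1)

def ffd_displacement(offsets, out_h, out_w):
    """Dense (out_h, out_w, 2) displacement field from control-point offsets
    of shape (gh, gw, 2), via tensor-product cubic B-spline interpolation.
    Assumes a control grid of at least 4x4."""
    gh, gw, _ = offsets.shape
    ys, xs = np.meshgrid(np.arange(out_h), np.arange(out_w), indexing='ij')
    gy = ys / (out_h - 1) * (gh - 3)              # pixel position in control-grid coords
    gx = xs / (out_w - 1) * (gw - 3)
    iy = np.clip(np.floor(gy).astype(int), 0, gh - 4)
    ix = np.clip(np.floor(gx).astype(int), 0, gw - 4)
    By, Bx = bspline_basis(gy - iy), bspline_basis(gx - ix)   # (H, W, 4) each
    disp = np.zeros((out_h, out_w, 2))
    for a in range(4):
        for b in range(4):
            disp += (By[..., a] * Bx[..., b])[..., None] * offsets[iy + a, ix + b]
    return disp   # add to the pixel grid, then bilinearly sample the mask to warp it
```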
RPAN: An End-to-End Recurrent Pose-Attention Network for Action Recognition in Videos
2017 IEEE International Conference on Computer Vision (ICCV) Pub Date: 2017-10-01 DOI: 10.1109/ICCV.2017.402 Pages: 3745-3754
Wenbin Du, Yali Wang, Y. Qiao
Abstract: Recent studies demonstrate the effectiveness of Recurrent Neural Networks (RNNs) for action recognition in videos. However, previous works mainly utilize the video-level category as supervision to train RNNs, which may prevent RNNs from learning complex motion structures over time. In this paper, we propose a Recurrent Pose-Attention Network (RPAN) to address this challenge, where we introduce a novel pose-attention mechanism to adaptively learn pose-related features at every time step of RNN action prediction. More specifically, we make three main contributions. Firstly, unlike previous works on pose-related action recognition, our RPAN is an end-to-end recurrent network which can exploit important spatial-temporal evolutions of human pose to assist action recognition in a unified framework. Secondly, instead of learning individual human-joint features separately, our pose-attention mechanism learns robust human-part features by partially sharing attention parameters over semantically related human joints. These human-part features are then fed into a human-part pooling layer to construct a highly discriminative pose-related representation for temporal action modeling. Thirdly, one important byproduct of our RPAN is pose estimation in videos, which can be used for coarse pose annotation in action videos. We evaluate the proposed RPAN quantitatively and qualitatively on two popular benchmarks, i.e., Sub-JHMDB and PennAction. Experimental results show that RPAN outperforms recent state-of-the-art methods on these challenging datasets.
Citations: 149
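A minimal reading of the pose-attention idea is: attention maps predicted per human-part group from the conv features and the previous hidden state, followed by part-wise pooling. The PyTorch sketch below is a simplified stand-in; the joint grouping, projection layers, and dimensions are assumptions, not the exact RPAN architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartAttentionPooling(nn.Module):
    """For each group of semantically related joints, predict an attention
    map over the conv feature grid from the feature map and the previous RNN
    hidden state, then concatenate the attended features of all parts
    ('human-part pooling')."""

    def __init__(self, feat_dim, hidden_dim, n_parts):
        super().__init__()
        self.feat_proj = nn.ModuleList([nn.Conv2d(feat_dim, 1, 1) for _ in range(n_parts)])
        self.hid_proj = nn.ModuleList([nn.Linear(hidden_dim, 1) for _ in range(n_parts)])

    def forward(self, feat_map, h_prev):
        # feat_map: (B, C, H, W) conv features of the current frame
        # h_prev:   (B, hidden_dim) previous hidden state of the recurrent model
        parts = []
        for conv, lin in zip(self.feat_proj, self.hid_proj):
            score = conv(feat_map) + lin(h_prev)[:, :, None, None]   # (B, 1, H, W)
            attn = F.softmax(score.flatten(2), dim=-1).view_as(score)
            parts.append((attn * feat_map).sum(dim=(2, 3)))          # (B, C)
        return torch.cat(parts, dim=1)   # pose-related feature fed to the RNN / classifier
```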
Object-Level Proposals
2017 IEEE International Conference on Computer Vision (ICCV) Pub Date: 2017-10-01 DOI: 10.1109/ICCV.2017.527 Pages: 4931-4939
Jianxiang Ma, Anlong Ming, Zilong Huang, Xinggang Wang, Yu Zhou
Abstract: Edges and surfaces are two fundamental visual elements of an object. The majority of existing object proposal approaches utilize edge or edge-like cues to rank candidates, whereas we argue that the surface cue, which conveys the 3D characteristics of objects, should also be captured effectively for proposals; this has rarely been discussed before. In this paper, an object-level proposal model is presented, which constructs an occlusion-based objectness that takes the surface cue into account. Specifically, we focus on better detection of occlusion edges to bring the surface cue into proposals: an occlusion-dominated fusion and a normalization criterion are designed to obtain approximate overall contour information, enhancing the occlusion edge map as much as possible and thus boosting proposals. Experimental results on the PASCAL VOC 2007 and MS COCO 2014 datasets demonstrate the effectiveness of our approach, which achieves around a 6% improvement in average recall over Edge Boxes at 1000 proposals and also leads to a modest gain in object detection performance.
Citations: 15
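To illustrate how an occlusion-edge map can rank proposals, the sketch below fuses several edge maps with a simple weighted sum and scores boxes by the enclosed occlusion-edge mass via an integral image. Both choices are crude stand-ins for the paper's occlusion-dominated fusion, normalization criterion, and objectness score.

```python
import numpy as np

def fuse_edge_maps(edge_maps, weights):
    """Weighted sum of edge / occlusion-edge maps, rescaled to [0, 1]."""
    fused = sum(w * m for w, m in zip(weights, edge_maps))
    return fused / (fused.max() + 1e-12)

def score_boxes(occ_edge_map, boxes):
    """Rank candidate boxes (x1, y1, x2, y2, inclusive) by the occlusion-edge
    mass they enclose, normalised by perimeter, using an integral image for
    O(1) box sums."""
    ii = np.pad(occ_edge_map, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    scores = []
    for x1, y1, x2, y2 in boxes:
        s = ii[y2 + 1, x2 + 1] - ii[y1, x2 + 1] - ii[y2 + 1, x1] + ii[y1, x1]
        scores.append(s / (2 * (x2 - x1 + y2 - y1) + 1e-12))
    order = np.argsort(scores)[::-1]
    return [boxes[i] for i in order]
```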
Supervision by Fusion: Towards Unsupervised Learning of Deep Salient Object Detector
2017 IEEE International Conference on Computer Vision (ICCV) Pub Date: 2017-10-01 DOI: 10.1109/ICCV.2017.436 Pages: 4068-4076
Dingwen Zhang, Junwei Han, Yu Zhang
Abstract: In light of the powerful learning capability of deep neural networks (DNNs), deep (convolutional) models have been built in recent years to address the task of salient object detection. Although training such deep saliency models can significantly improve detection performance, it requires large-scale manual supervision in the form of pixel-level human annotation, which is highly labor-intensive and time-consuming. To address this problem, this paper makes the earliest effort to train a deep salient object detector without using any human annotation. The key insight is "supervision by fusion", i.e., generating useful supervisory signals from the fusion process of weak but fast unsupervised saliency models. Based on this insight, we combine an intra-image fusion stream and an inter-image fusion stream in the proposed framework to generate the learning curriculum and pseudo ground truth for supervising the training of the deep salient object detector. Comprehensive experiments on four benchmark datasets demonstrate that our method can approach the same network trained with full supervision (within a 2-5% performance gap) and, more encouragingly, even outperform a number of fully supervised state-of-the-art approaches.
Citations: 144
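A minimal version of the "supervision by fusion" idea: average several weak unsupervised saliency maps into a fused map and threshold it into confident foreground, confident background, and ignored pseudo labels for training the deep detector. The uniform reliabilities and the two thresholds below are assumptions; the paper additionally uses an inter-image fusion stream and a learning curriculum.

```python
import numpy as np

def intra_image_fusion(weak_maps, reliabilities=None):
    """Weighted fusion of several weak saliency maps (each in [0, 1]) for one
    image into a single pseudo ground-truth map."""
    weak_maps = np.stack(weak_maps).astype(float)            # (M, H, W)
    if reliabilities is None:
        reliabilities = np.full(len(weak_maps), 1.0 / len(weak_maps))
    return np.tensordot(reliabilities, weak_maps, axes=1)    # (H, W)

def pseudo_labels(fused, lo=0.3, hi=0.7):
    """Turn the fused map into pixel-level pseudo labels for training:
    confident foreground (1), confident background (0), ignored (-1)."""
    labels = np.full(fused.shape, -1, dtype=int)
    labels[fused >= hi] = 1
    labels[fused <= lo] = 0
    return labels
```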
Makeup-Go: Blind Reversion of Portrait Edit
2017 IEEE International Conference on Computer Vision (ICCV) Pub Date: 2017-10-01 DOI: 10.1109/ICCV.2017.482 Pages: 4511-4519
Ying-Cong Chen, Xiaoyong Shen, Jiaya Jia
Abstract: Virtual face beautification (or makeup) has become a common operation in camera and image-processing apps, which is actually deceiving. In this paper, we propose the task of restoring a portrait image from this process. As the first attempt along this line, we assume unknown global operations on human faces and aim to tackle the two issues of skin smoothing and skin color change. These two tasks, intriguingly, impose very different difficulties in estimating subtle details and major color variation. We propose a Component Regression Network (CRN) and address the limitation of using a Euclidean loss in blind reversion. CRN maps the edited portrait images back to the original ones without knowing the details of the beautification operation. Our experiments demonstrate the effectiveness of the system for this novel task.
Citations: 18
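The component-regression idea can be pictured as regressing the edit residual in a principal-component basis rather than with a single Euclidean loss, so low-energy detail components are not drowned out by the dominant colour components. The sketch below is only that rough picture; the actual CRN architecture and loss are not reproduced here.

```python
import numpy as np

def fit_components(residuals, k):
    """PCA basis of training residuals (original minus edited patches,
    flattened to (N, D) vectors); returns the mean and the top-k directions."""
    mean = residuals.mean(0)
    U, S, Vt = np.linalg.svd(residuals - mean, full_matrices=False)
    return mean, Vt[:k]                           # (D,), (k, D)

def component_loss(pred_residual, true_residual, mean, components):
    """Per-component squared errors.  A plain Euclidean loss would let the
    few high-energy colour components dominate; reporting one error per
    component gives the small detail components equal weight (an assumed
    weighting for illustration)."""
    p = (pred_residual - mean) @ components.T     # (N, k) predicted coefficients
    t = (true_residual - mean) @ components.T
    return ((p - t) ** 2).mean(axis=0)            # one error per component
```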