Latest Publications from the 2017 IEEE International Conference on Computer Vision (ICCV)

Joint Convolutional Analysis and Synthesis Sparse Representation for Single Image Layer Separation
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.189 | Pages: 1717-1725
Shuhang Gu, Deyu Meng, W. Zuo, Lei Zhang
Abstract: Analysis sparse representation (ASR) and synthesis sparse representation (SSR) are two representative approaches for sparsity-based image modeling. An image is described mainly by the non-zero coefficients in SSR, while it is mainly characterized by the indices of zeros in ASR. To exploit the complementary representation mechanisms of ASR and SSR, we integrate the two models and propose a joint convolutional analysis and synthesis (JCAS) sparse representation model. The convolutional implementation is adopted to more effectively exploit global image information. In JCAS, a single image is decomposed into two layers: one is approximated by ASR to represent large-scale image structures, and the other by SSR to represent fine-scale image textures. The synthesis dictionary is adaptively learned in JCAS to describe the texture patterns of different single image layer separation tasks. We evaluate the proposed JCAS model on a variety of applications, including rain streak removal and high dynamic range image tone mapping. The results show that our JCAS method outperforms state-of-the-art methods in these applications in terms of both quantitative measures and visual quality.
Citations: 167
Self-Organized Text Detection with Minimal Post-processing via Border Learning
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.535 | Pages: 5010-5019
Yue Wu, P. Natarajan
Abstract: In this paper we propose a new solution to the text detection problem via border learning. Specifically, we make four major contributions: 1) we analyze the insufficiencies of the classic non-text and text settings for text detection; 2) we introduce the border class to the text detection problem for the first time, and validate that the decoding process is largely simplified with the help of the text border; 3) we collect and release a new text detection PPT dataset containing 10,692 images with non-text, border, and text annotations; 4) we develop a lightweight (only 0.28M parameters) fully convolutional network (FCN) to effectively learn borders in text images. The results of our extensive experiments show that the proposed solution achieves comparable performance, and often outperforms state-of-the-art approaches on standard benchmarks, even though our solution only requires minimal post-processing to parse a bounding box from a detected text map, while others often require heavy post-processing.
Citations: 75
Modeling Urban Scenes from Pointclouds
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.414 | Pages: 3857-3866
William Nguatem, H. Mayer
Abstract: We present a method for Modeling Urban Scenes from Pointclouds (MUSP). In contrast to existing approaches, MUSP is robust, scalable, and provides a more complete description by not making a Manhattan-world assumption and by modeling both buildings (with polyhedra) and the non-planar ground (using NURBS). First, we segment the scene into consistent patches using a divide-and-conquer algorithm within a nonparametric Bayesian framework (stick-breaking construction). These patches often correspond to meaningful structures, such as the ground, facades, roofs, and roof superstructures. We use polygon sweeping to fit predefined templates for buildings, and for the ground a NURBS surface is fit and uniformly tessellated. Finally, we apply Boolean operations to the polygons for buildings, building parts, and the tessellated ground to clip unnecessary geometry (e.g., facade protrusions below the non-planar ground), leading to the final model. The explicit Bayesian formulation of scene segmentation makes our approach suitable for challenging datasets with varying amounts of noise, outliers, and point density. We demonstrate the robustness of MUSP on 3D pointclouds from image matching as well as LiDAR.
Citations: 16
Primary Video Object Segmentation via Complementary CNNs and Neighborhood Reversible Flow
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.158 | Pages: 1426-1434
Jia Li, Anlin Zheng, Xiaowu Chen, Bin Zhou
Abstract: This paper proposes a novel approach for segmenting primary video objects using Complementary Convolutional Neural Networks (CCNN) and neighborhood reversible flow. The proposed approach first pre-trains CCNN end-to-end on massive images with manually annotated salient objects; the trained CCNN has two separate branches that simultaneously handle two complementary tasks, i.e., foregroundness and backgroundness estimation. By applying CCNN to each video frame, the spatial foregroundness and backgroundness maps can be initialized and then propagated between frames so as to segment primary video objects and suppress distractors. To enable efficient temporal propagation, we divide each frame into superpixels and construct a neighborhood reversible flow that reflects the most reliable temporal correspondences between superpixels in far-away frames. Within such a flow, the initialized foregroundness and backgroundness can be efficiently and accurately propagated along the temporal axis, so that primary video objects gradually pop out and distractors are well suppressed. Extensive experimental results on three video datasets show that the proposed approach achieves impressive performance in comparison with 18 state-of-the-art models.
Citations: 23
Multi-view Non-rigid Refinement and Normal Selection for High Quality 3D Reconstruction
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.261 | Pages: 2401-2409
Sk. Mohammadul Haque, V. Govindu
Abstract: In recent years, there have been a variety of proposals for high quality 3D reconstruction by fusion of depth and normal maps, which contain good low and high frequency information respectively. Typically, these methods create an initial mesh representation of the complete object or scene being scanned. Subsequently, normal estimates are assigned to each mesh vertex and a mesh-normal fusion step is carried out. In this paper, we present a complete pipeline for such depth-normal fusion. The key innovations in our pipeline are twofold. Firstly, we introduce a global multi-view non-rigid refinement step that corrects for the non-rigid misalignment present in the depth and normal maps. We demonstrate that such a correction is crucial for preserving fine-scale 3D features in the final reconstruction. Secondly, despite adequate care, the averaging of multiple normals invariably results in blurring of 3D detail. To mitigate this problem, we propose an approach that selects one out of many available normals. Our global cost for normal selection incorporates a variety of desirable properties and can be efficiently solved using graph cuts. We demonstrate the efficacy of our approach in generating high quality 3D reconstructions of both synthetic and real 3D models, and compare with existing methods in the literature.
Citations: 1
Learning-Based Cloth Material Recovery from Video
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.470 | Pages: 4393-4403
Shan Yang, Junbang Liang, M. Lin
Abstract: Image and video understanding enables better reconstruction of the physical world. Existing methods focus largely on the geometry and visual appearance of the reconstructed scene. In this paper, we extend the frontier in image understanding and present a method to recover the material properties of cloth from a video. Previous cloth material recovery methods often require markers or complex experimental set-ups to acquire physical properties, or are limited to certain types of images or videos. Our approach takes advantage of the appearance changes of the moving cloth to infer its physical properties. To extract information about the cloth, our method characterizes both the motion and the visual appearance of the cloth geometry. We apply a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) neural network to material recovery of cloth from videos. We also exploit simulated data to help the statistical learning of the mapping between the visual appearance and the material type of the cloth. The effectiveness of our method is demonstrated via validation using both simulated datasets and real-life recorded videos.
Citations: 58
Robust Object Tracking Based on Temporal and Spatial Deep Networks
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.130 | Pages: 1153-1162
Zhu Teng, Junliang Xing, Qiang Wang, Congyan Lang, Songhe Feng, Yi Jin
Abstract: Recently, deep neural networks have been widely employed to deal with the visual tracking problem. In this work, we present a new deep architecture which incorporates temporal and spatial information to boost tracking performance. Our deep architecture contains three networks: a Feature Net, a Temporal Net, and a Spatial Net. The Feature Net extracts general feature representations of the target. With these feature representations, the Temporal Net encodes the trajectory of the target and directly learns temporal correspondences to estimate the object state from a global perspective. Based on the learning results of the Temporal Net, the Spatial Net further refines the object tracking state using local spatial object information. Extensive experiments on four of the largest tracking benchmarks, including VOT2014, VOT2016, OTB50, and OTB100, demonstrate the competitive performance of the proposed tracker against a number of state-of-the-art algorithms.
Citations: 57
Raster-to-Vector: Revisiting Floorplan Transformation
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.241 | Pages: 2214-2222
Chen Liu, Jiajun Wu, Pushmeet Kohli, Yasutaka Furukawa
Abstract: This paper addresses the problem of converting a rasterized floorplan image into a vector-graphics representation. Unlike existing approaches that rely on a sequence of low-level image processing heuristics, we adopt a learning-based approach. A neural architecture first transforms a rasterized image into a set of junctions that represent low-level geometric and semantic information (e.g., wall corners or door end-points). Integer programming is then formulated to aggregate junctions into a set of simple primitives (e.g., wall lines, door lines, or icon boxes) to produce a vectorized floorplan, while ensuring a topologically and geometrically consistent result. Our algorithm significantly outperforms existing methods and achieves around 90% precision and recall, reaching the range of production-ready performance. The vector representation allows 3D model pop-up for better indoor scene visualization, direct model manipulation for architectural remodeling, and further computational applications such as data analysis. Our system is efficient: we have converted a hundred thousand production-level floorplan images into the vector representation and generated 3D pop-up models.
Citations: 142
Multimodal Gaussian Process Latent Variable Models with Harmonization
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.538 | Pages: 5039-5047
Guoli Song, Shuhui Wang, Qingming Huang, Q. Tian
Abstract: In this work, we address the multimodal learning problem with Gaussian process latent variable models (GPLVMs) and their application to cross-modal retrieval. Existing GPLVM-based studies generally impose individual priors over the model parameters and ignore the intrinsic relations among these parameters. Considering the strong complementarity between modalities, we propose a novel joint prior over the parameters of multimodal GPLVMs to propagate multimodal information in both the kernel hyperparameter spaces and the latent space. The joint prior is formulated as a harmonization constraint on the model parameters, which enforces agreement among the modality-specific GP kernels and similarity in the latent space. We incorporate the harmonization mechanism into the learning process of multimodal GPLVMs. The proposed methods are evaluated on three widely used multimodal datasets for cross-modal retrieval. Experimental results show that the harmonization mechanism is beneficial to GPLVM algorithms for learning non-linear correlations among heterogeneous modalities.
Citations: 7
Cross-Modal Deep Variational Hashing
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.439 | Pages: 4097-4105
Venice Erin Liong, Jiwen Lu, Yap-Peng Tan, Jie Zhou
Abstract: In this paper, we propose a cross-modal deep variational hashing (CMDVH) method for cross-modality multimedia retrieval. Unlike existing cross-modal hashing methods, which learn a single pair of projections to map each example to a binary vector, we design a pair of deep neural networks to learn non-linear transformations from image-text input pairs, so that unified binary codes can be obtained. We then design the modality-specific neural networks in a probabilistic manner, where we model a latent variable that stays as close as possible to the inferred binary codes and is approximated by a posterior distribution regularized by a known prior. Experimental results on three benchmark datasets show the efficacy of the proposed approach.
Citations: 78