2017 IEEE International Conference on Computer Vision Workshops (ICCVW): Latest Publications

UCT: Learning Unified Convolutional Networks for Real-Time Visual Tracking
2017 IEEE International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2017-11-10 DOI: 10.1109/ICCVW.2017.231
Zheng Zhu, Guan Huang, Wei Zou, Dalong Du, Chang Huang
Abstract: Convolutional neural network (CNN) based tracking approaches have shown favorable performance in recent benchmarks. Nonetheless, the chosen CNN features are usually pre-trained on a different task, and the individual components of a tracking system are learned separately, so the achieved tracking performance may be suboptimal. Besides, most of these trackers are not designed for real-time applications because of their time-consuming feature extraction and complex optimization details. In this paper, we propose an end-to-end framework that learns the convolutional features and performs the tracking process simultaneously, namely a unified convolutional tracker (UCT). Specifically, the UCT treats both the feature extractor and the tracking process (ridge regression) as convolution operations and trains them jointly, so that the learned CNN features are tightly coupled to the tracking process. For online tracking, an efficient updating method is proposed by introducing a peak-versus-noise ratio (PNR) criterion, and scale changes are handled efficiently by incorporating a scale branch into the network. The proposed approach yields superior tracking performance while maintaining real-time speed: the standard UCT and UCT-Lite track generic objects at 41 FPS and 154 FPS, respectively, without further optimization. Experiments are performed on four challenging benchmark tracking datasets, OTB2013, OTB2015, VOT2014 and VOT2015, and our method achieves state-of-the-art results on these benchmarks compared with other real-time trackers.
Citations: 77
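The update criterion above can be made concrete with a short sketch. The following is one plausible form of a peak-versus-noise ratio computed on a correlation response map, analogous to the peak-to-sidelobe ratio common in correlation-filter tracking; the paper's exact formula, the `peak_radius` window and the update threshold `tau` are assumptions.

```python
import numpy as np

def peak_versus_noise_ratio(response, peak_radius=2):
    """Score a 2D correlation response map: a high PNR suggests a
    reliable detection, a low PNR suggests occlusion or drift, so the
    online model update can be skipped for that frame."""
    py, px = np.unravel_index(np.argmax(response), response.shape)
    peak = response[py, px]
    # Everything outside a small window around the peak counts as noise.
    mask = np.ones_like(response, dtype=bool)
    mask[max(0, py - peak_radius):py + peak_radius + 1,
         max(0, px - peak_radius):px + peak_radius + 1] = False
    noise = response[mask]
    return (peak - noise.mean()) / (noise.std() + 1e-8)

# Hypothetical usage: update the tracker only on confident frames.
# if peak_versus_noise_ratio(R) > tau:
#     tracker.update(frame)
```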
Ancient Roman Coin Recognition in the Wild Using Deep Learning Based Recognition of Artistically Depicted Face Profiles
2017 IEEE International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2017-10-29 DOI: 10.1109/ICCVW.2017.342
Imanol Schlag, Ognjen Arandjelovic
Abstract: As both a particularly interesting application in the realm of cultural heritage and a technically challenging problem, computer vision based analysis of Roman Imperial coins has been attracting an increasing amount of research. In this paper we make several important contributions. Firstly, we address a key limitation of existing work, which is largely characterized by the application of generic object recognition techniques and a lack of domain knowledge. In contrast, our work approaches coin recognition in much the same way as a human expert would: by identifying the emperor universally shown on the obverse. To this end we develop a deep convolutional network, carefully crafted for what is effectively a specific instance of profile face recognition. No less importantly, we also address a major methodological flaw of previous research, which is, as we explain in detail, insufficiently systematic and rigorous, and mired in confounding factors. Lastly, we introduce three carefully collected and annotated data sets, and using these we demonstrate the effectiveness of the proposed approach, which is shown to exceed the performance of the state of the art by approximately an order of magnitude.
Citations: 30
Particle Filter Based Probabilistic Forced Alignment for Continuous Gesture Recognition
2017 IEEE International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2017-10-29 DOI: 10.1109/ICCVW.2017.364
Necati Cihan Camgöz, Simon Hadfield, R. Bowden
Abstract: In this paper, we propose a novel particle filter based probabilistic forced alignment approach for training spatio-temporal deep neural networks using weak border-level annotations. The proposed method jointly learns to localize and recognize isolated instances in continuous streams. This is done by drawing training volumes from a prior distribution of likely regions and training a discriminative 3D-CNN on this data. The classifier is then used to calculate the posterior distribution by scoring the training examples, and this posterior serves as the prior for the next sampling stage. We apply the proposed approach to the challenging task of large-scale user-independent continuous gesture recognition, evaluating performance on the popular ChaLearn 2016 Continuous Gesture Recognition (ConGD) dataset. Our method surpasses state-of-the-art results, obtaining Mean Jaccard Index scores of 0.3646 and 0.3744 on the validation and test sets of ConGD, respectively. Furthermore, we participated in the ChaLearn 2017 Continuous Gesture Recognition Challenge and were ranked 3rd. It should be noted that our method is learner-independent; it can easily be combined with other approaches.
Citations: 7
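A minimal sketch of the sampling loop described in the abstract, with the 3D-CNN abstracted into a black-box `score_fn(center, length)` that returns the classifier's confidence for a temporal window. The Gaussian proposals, parameter values and names are illustrative assumptions, and the paper additionally retrains the classifier between sampling stages, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(0)

def refine_alignment(score_fn, weak_center, n_particles=50, n_iters=5,
                     sigma_center=8.0, sigma_len=4.0, init_len=32):
    """Particle-filter-style refinement of a weak temporal annotation.
    Each particle is a candidate (center, length) window in the stream;
    windows are scored, resampled in proportion to their scores (the
    posterior becomes the next prior), and perturbed again."""
    centers = weak_center + rng.normal(0, sigma_center, n_particles)
    lengths = np.clip(init_len + rng.normal(0, sigma_len, n_particles), 8, None)
    for _ in range(n_iters):
        scores = np.array([score_fn(c, l) for c, l in zip(centers, lengths)])
        weights = scores / scores.sum()            # assumes positive scores
        idx = rng.choice(n_particles, n_particles, p=weights)   # resample
        centers = centers[idx] + rng.normal(0, sigma_center, n_particles)
        lengths = np.clip(lengths[idx] + rng.normal(0, sigma_len, n_particles),
                          8, None)
    best = np.argmax([score_fn(c, l) for c, l in zip(centers, lengths)])
    return centers[best], lengths[best]
```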
Propagation of Orientation Uncertainty of 3D Rigid Object to Its Points
2017 IEEE International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2017-10-29 DOI: 10.1109/ICCVW.2017.255
M. Franaszek, G. Cheok
Abstract: If a CAD model of a rigid object is available, the location of any point on the object can be derived from its measured 6DOF pose. However, the uncertainty of the measured pose propagates to the uncertainty of the point in an anisotropic way. We investigate this propagation for a class of systems that determine an object's pose using point-based rigid body registration. For such systems, the uncertainty in the locations of the points used for registration propagates to the pose uncertainty. We find that the direction corresponding to the smallest propagated uncertainty remains relatively unchanged in the object's local frame, regardless of object pose. We show that this direction may be closely approximated by the moment-of-inertia axis derived from the configuration of the fiducials. We use the existing theory of rigid-body registration to explain the experimental results, and we discuss the limitations of the theory and the practical implications of our findings.
Citations: 2
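The anisotropic propagation can be written down directly with a standard first-order (small-angle) approximation. The sketch below assumes the orientation uncertainty is given as a covariance over an axis-angle perturbation in the world frame; the paper's exact parametrization may differ.

```python
import numpy as np

def skew(v):
    """Cross-product matrix: skew(v) @ u == np.cross(v, u)."""
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])

def point_covariance(R, p_local, cov_rot, cov_trans=None):
    """First-order propagation of pose uncertainty to a point.
    For a small world-frame rotation perturbation dtheta, the point
    x = R @ p moves by dx ~= -skew(R @ p) @ dtheta, hence
    Cov(x) = J @ Cov(dtheta) @ J.T (+ translation covariance)."""
    J = -skew(R @ p_local)
    cov = J @ cov_rot @ J.T
    if cov_trans is not None:
        cov += cov_trans
    return cov

# The eigenvector with the smallest eigenvalue is the direction of
# smallest propagated uncertainty discussed in the abstract:
# w, V = np.linalg.eigh(point_covariance(R, p, cov_rot))
# direction = V[:, 0]
```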
BEHAVE — Behavioral Analysis of Visual Events for Assisted Living Scenarios
2017 IEEE International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2017-10-28 DOI: 10.1109/ICCVW.2017.160
Jonas Vlasselaer, C. Crispim, F. Brémond, Anton Dries
Abstract: This paper proposes BEHAVE, a person-centered pipeline for probabilistic event recognition. The proposed pipeline first detects the set of people in a video frame, then searches for correspondences between people in the current and previous frames (i.e., people tracking). Finally, event recognition is carried out for each person using probabilistic logic models (PLMs, in the ProbLog2 language). PLMs represent interactions among people, home appliances and semantic regions. They also make it possible to assess the probability of an event given noisy observations of the real world. BEHAVE was evaluated on the tasks of online (non-clipped videos) and open-set event recognition (i.e., target events plus a "none" class) on video recordings of seniors carrying out daily tasks. Results show that BEHAVE improves event recognition accuracy by handling missed and partially satisfied logic models. Future work will investigate how to extend PLMs to represent temporal relations among events.
Citations: 1
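To make the PLM idea concrete, here is a toy stand-in for a conjunctive probabilistic rule over noisy detections. The rule, the zone names and the confidences are invented for illustration; a real BEHAVE model would be written and evaluated in ProbLog2 rather than plain Python.

```python
def event_probability(rule_prob, observation_probs):
    """Probability of an event defined by a conjunctive probabilistic
    rule over independent noisy observations:
        P(event) = p_rule * prod(p_obs_i)
    A partially satisfied rule (one low-confidence observation) lowers
    the probability instead of discarding the event outright."""
    p = rule_prob
    for p_obs in observation_probs:
        p *= p_obs
    return p

# ProbLog2-style rule this mimics (names hypothetical):
#   0.9::prepare_meal :- in_zone(kitchen), close_to(stove).
# With detector confidences 0.8 and 0.7:
print(event_probability(0.9, [0.8, 0.7]))  # 0.504
```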
Spatially-Variant Kernel for Optical Flow Under Low Signal-to-Noise Ratios: Application to Microscopy
2017 IEEE International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2017-10-23 DOI: 10.1109/ICCVW.2017.12
Denis Fortun, N. Debroux, C. Kervrann
Abstract: Local and global approaches can be identified as the two main classes of optical flow estimation methods. In this paper, we propose a framework that combines the advantages of these two principles, namely the robustness to noise of the local approach and the discontinuity preservation of the global approach. This is particularly crucial in biological imaging, where the noise produced by microscopes is one of the main issues for optical flow estimation. The idea is to spatially adapt the support of the local parametric constraint in the combined local-global model [6]. To this end, we jointly estimate the motion field and the parameters of the spatial support. We apply our approach to the case of Gaussian filtering and derive efficient minimization schemes for the usual data terms. Estimating a spatially varying standard-deviation map prevents the smoothing of motion discontinuities while ensuring robustness to noise. We validate our method for a standard model and demonstrate how a baseline approach with a pixel-wise data term can be improved when integrated in our framework. The method is evaluated on the Middlebury benchmark with ground truth and on real fluorescence microscopy data.
Citations: 0
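The local half of the combined local-global idea can be sketched as a Lucas-Kanade-style estimate whose Gaussian support width varies per pixel. The paper embeds this in a joint energy and estimates the sigma map together with the flow, which the sketch below does not attempt; parameter values are illustrative.

```python
import numpy as np

def local_flow(Ix, Iy, It, y, x, sigma, radius=7):
    """Weighted least-squares flow at pixel (y, x) from image gradients
    Ix, Iy, It, with a Gaussian support of per-pixel standard deviation
    sigma: small sigma near motion discontinuities, large sigma in noisy
    homogeneous regions. Assumes (y, x) lies at least `radius` pixels
    from the image border."""
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    w = np.exp(-(ys**2 + xs**2) / (2.0 * sigma**2))
    sl = (slice(y - radius, y + radius + 1), slice(x - radius, x + radius + 1))
    ix, iy, it = Ix[sl], Iy[sl], It[sl]
    # Weighted normal equations of the brightness-constancy constraint.
    M = np.array([[np.sum(w * ix * ix), np.sum(w * ix * iy)],
                  [np.sum(w * ix * iy), np.sum(w * iy * iy)]])
    b = np.array([-np.sum(w * ix * it), -np.sum(w * iy * it)])
    return np.linalg.solve(M + 1e-6 * np.eye(2), b)  # (u, v)
```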
Human Action Recognition: Pose-Based Attention Draws Focus to Hands
2017 IEEE International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2017-10-23 DOI: 10.1109/ICCVW.2017.77
Fabien Baradel, Christian Wolf, J. Mille
Abstract: We propose a new spatio-temporal attention mechanism for human action recognition that automatically attends to the most important human hands and detects the most discriminative moments in an action. Attention is handled in a recurrent manner employing a Recurrent Neural Network (RNN) and is fully differentiable. In contrast to standard soft-attention mechanisms, our approach does not use the hidden RNN state as input to the attention model. Instead, attention distributions are drawn using external information: the articulated human pose. We performed an extensive ablation study to show the strengths of this approach, and we particularly studied the conditioning aspect of the attention mechanism. We evaluate the method on the largest currently available human action recognition dataset, NTU-RGB+D, and report state-of-the-art results. Another advantage of our model is a degree of explainability: the spatial and temporal attention distributions at test time make it possible to study and verify on which parts of the input data the method focuses.
Citations: 91
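The distinguishing design choice, that attention weights come from the pose rather than from the RNN hidden state, fits in a few lines. The sketch below assumes per-frame pose vectors and one feature vector per hand; the layer sizes and the two-hand setup are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class PoseConditionedAttention(nn.Module):
    """Soft attention over hand features where the attention
    distribution is computed from the articulated pose, not from the
    recurrent hidden state."""
    def __init__(self, pose_dim=39, feat_dim=256, hidden=128):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(pose_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 2))          # one score per hand

    def forward(self, pose, hand_feats):
        # pose: (B, pose_dim); hand_feats: (B, 2, feat_dim)
        alpha = torch.softmax(self.score(pose), dim=1)         # (B, 2)
        return (alpha.unsqueeze(-1) * hand_feats).sum(dim=1)   # (B, feat_dim)

# The attended feature per frame would then feed a recurrent model,
# e.g. a GRU over the sequence of attended features.
```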
Exploiting the Complementarity of Audio and Visual Data in Multi-speaker Tracking
2017 IEEE International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2017-10-23 DOI: 10.1109/ICCVW.2017.60
Yutong Ban, Laurent Girin, Xavier Alameda-Pineda, R. Horaud
Abstract: Multi-speaker tracking is a central problem in human-robot interaction. In this context, exploiting auditory and visual information is gratifying and challenging at the same time. Gratifying because the complementary nature of auditory and visual information allows us to be more robust against noise and outliers than unimodal approaches. Challenging because how to properly fuse auditory and visual information for multi-speaker tracking is far from being a solved problem. In this paper we propose a probabilistic generative model that tracks multiple speakers by jointly exploiting auditory and visual features in their own representation spaces. Importantly, the method is robust to missing data and is therefore able to track even when observations from one of the modalities are absent. Quantitative and qualitative results on the AVDIAR dataset are reported.
Citations: 22
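The robustness to missing data follows naturally if the modalities are treated as conditionally independent observation channels: an absent modality simply drops out of the likelihood product. A hedged sketch, with hypothetical observation functions `h_audio` and `h_visual` that map a speaker state into each modality's representation space:

```python
import numpy as np
from scipy.stats import multivariate_normal

def speaker_likelihood(state, audio_obs, visual_obs,
                       audio_cov, visual_cov, h_audio, h_visual):
    """Joint likelihood of a speaker state under independent Gaussian
    observation models in each modality's own representation space.
    Passing None for a modality drops it from the product, so tracking
    continues on the remaining modality alone."""
    lik = 1.0
    if audio_obs is not None:
        lik *= multivariate_normal.pdf(audio_obs, mean=h_audio(state),
                                       cov=audio_cov)
    if visual_obs is not None:
        lik *= multivariate_normal.pdf(visual_obs, mean=h_visual(state),
                                       cov=visual_cov)
    return lik
```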
Registration of RGB and Thermal Point Clouds Generated by Structure From Motion
2017 IEEE International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2017-10-22 DOI: 10.1109/ICCVW.2017.57
Trong Phuc Truong, M. Yamaguchi, Shohei Mori, Vincent Nozick, H. Saito
Abstract: Thermal imaging has become a valuable tool for remote sensing in various fields and can provide relevant information for object recognition or classification. In this paper, we present an automated method to obtain a 3D model that fuses data from a visible and a thermal camera. The RGB and thermal point clouds are generated independently by structure from motion. The registration process comprises a normalization of the point cloud scale, a global registration based on calibration data and the structure-from-motion output, and a fine registration employing a variant of the Iterative Closest Point optimization. Experimental results demonstrate the accuracy and robustness of the overall process.
Citations: 22
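The fine-registration stage is a variant of Iterative Closest Point; a generic point-to-point ICP with a Kabsch/SVD rotation fit is sketched below for clouds that are already scale-normalized and coarsely aligned. This is the textbook step, not the paper's specific variant.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, n_iters=30):
    """Point-to-point ICP between (N, 3) and (M, 3) clouds that are
    already scale-normalized and roughly pre-aligned. Returns the
    accumulated rotation R and translation t with dst ~= src @ R.T + t."""
    tree = cKDTree(dst)
    R, t = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(n_iters):
        _, idx = tree.query(cur)                 # nearest-neighbor matches
        matched = dst[idx]
        mu_s, mu_d = cur.mean(axis=0), matched.mean(axis=0)
        H = (cur - mu_s).T @ (matched - mu_d)    # cross-covariance
        U, _, Vt = np.linalg.svd(H)              # Kabsch rotation fit
        D = np.diag([1., 1., np.sign(np.linalg.det(Vt.T @ U.T))])
        R_step = Vt.T @ D @ U.T
        t_step = mu_d - R_step @ mu_s
        cur = cur @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step   # compose transforms
    return R, t
```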
A Handcrafted Normalized-Convolution Network for Texture Classification
2017 IEEE International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2017-10-22 DOI: 10.1109/ICCVW.2017.149
Ngoc-Son Vu, Vu-Lam Nguyen, P. Gosselin
Abstract: In this paper, we propose a Handcrafted Normalized-Convolution Network (NmzNet) for efficient texture classification. NmzNet is implemented as a three-layer normalized convolution network, which computes successive normalized convolutions with a predefined filter bank (a Gabor filter bank) and modulus non-linearities. Coefficients from the different layers are aggregated by Fisher Vector aggregation to form the final discriminative features. Experimental evaluation on three texture datasets, UIUC, KTH-TIPS-2a and KTH-TIPS-2b, indicates that the proposed approach achieves a good classification rate compared with other handcrafted methods. The results additionally indicate that only a marginal difference exists between the best classification rates of recent CNNs and that of the proposed method on these datasets.
Citations: 6
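One layer of the described pipeline, Gabor filtering followed by a modulus non-linearity, can be sketched with OpenCV. The cross-channel L2 normalization below is one reading of "normalized convolution", and the filter parameters are assumptions; the paper's exact scheme may differ.

```python
import cv2
import numpy as np

def gabor_layer(img, n_orientations=8, ksize=31, sigma=4.0, lambd=10.0):
    """Convolve a grayscale image with a bank of oriented Gabor filters,
    take the modulus, then L2-normalize responses across the bank."""
    responses = []
    for k in range(n_orientations):
        theta = k * np.pi / n_orientations
        kern = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, 0.5)
        responses.append(np.abs(cv2.filter2D(img, cv2.CV_32F, kern)))
    stack = np.stack(responses, axis=0)                     # (K, H, W)
    return stack / (np.linalg.norm(stack, axis=0) + 1e-8)

# Stacking three such layers and aggregating the coefficients with a
# Fisher Vector encoder would mirror the NmzNet pipeline described above.
```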