2016 IEEE Winter Conference on Applications of Computer Vision (WACV): Latest Papers

Deep learning the dynamic appearance and shape of facial action units
2016 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2016-05-26 DOI: 10.1109/WACV.2016.7477625
S. Jaiswal, M. Valstar
{"title":"Deep learning the dynamic appearance and shape of facial action units","authors":"S. Jaiswal, M. Valstar","doi":"10.1109/WACV.2016.7477625","DOIUrl":"https://doi.org/10.1109/WACV.2016.7477625","url":null,"abstract":"Spontaneous facial expression recognition under uncontrolled conditions is a hard task. It depends on multiple factors including shape, appearance and dynamics of the facial features, all of which are adversely affected by environmental noise and low intensity signals typical of such conditions. In this work, we present a novel approach to Facial Action Unit detection using a combination of Convolutional and Bi-directional Long Short-Term Memory Neural Networks (CNN-BLSTM), which jointly learns shape, appearance and dynamics in a deep learning manner. In addition, we introduce a novel way to encode shape features using binary image masks computed from the locations of facial landmarks. We show that the combination of dynamic CNN features and Bi-directional Long Short-Term Memory excels at modelling the temporal information. We thoroughly evaluate the contributions of each component in our system and show that it achieves state-of-the-art performance on the FERA-2015 Challenge dataset.","PeriodicalId":124363,"journal":{"name":"2016 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128910832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 153
Region graph based method for multi-object detection and tracking using depth cameras
2016 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2016-03-11 DOI: 10.1109/WACV.2016.7477568
Sachin Mehta, B. Prabhakaran
{"title":"Region graph based method for multi-object detection and tracking using depth cameras","authors":"Sachin Mehta, B. Prabhakaran","doi":"10.1109/WACV.2016.7477568","DOIUrl":"https://doi.org/10.1109/WACV.2016.7477568","url":null,"abstract":"In this paper, we propose a multi-object detection and tracking method using depth cameras. Depth maps are very noisy and obscure in object detection. We first propose a region-based method to suppress high magnitude noise which cannot be filtered using spatial filters. Second, the proposed method detects Regions of Interest by temporal learning, which are then tracked using a weighted graph-based approach. We demonstrate the performance of the proposed method on standard depth camera datasets with and without object occlusions. Experimental results show that the proposed method is able to suppress high magnitude noise in depth maps and detect/track the objects (with and without occlusion).","PeriodicalId":124363,"journal":{"name":"2016 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129555365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Pose tracking by efficiently exploiting global features
2016 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2016-03-07 DOI: 10.1109/WACV.2016.7477563
Ratnesh Kumar, Dhruv Batra
{"title":"Pose tracking by efficiently exploiting global features","authors":"Ratnesh Kumar, Dhruv Batra","doi":"10.1109/WACV.2016.7477563","DOIUrl":"https://doi.org/10.1109/WACV.2016.7477563","url":null,"abstract":"Typical pose tracking algorithms first obtain a set of plausible pose hypotheses in all image frames of a video and subsequently stitch compatible detections across time to form a pose-track. This approach to tracking is commonly termed tracking-by-detections, and has been very successful in other areas such as multiple object tracking, video segmentation using object proposals. Often models in this category can only incorporate local spatio-temporal evidence due to exponentially increased cost when using global information. Local spatio-temporal evidence can be ambiguous, thus leading to an inferior objective modeling. To deal with ambiguities in local information it is necessary to incorporate global information over multiple frames into a model. Based on the recent advances in generating multiple solutions from a probabilistic model, we first generate multiple plausible pose-track hypotheses, and subsequently employ a mixture of local and global features to express the quality of these solutions with high fidelity. We perform extensive experiments and competitive results across varied datasets demonstrate the robustness of our approach.","PeriodicalId":124363,"journal":{"name":"2016 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123065027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Variational multi-phase segmentation using high-dimensional local features
2016 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2016-03-07 DOI: 10.1109/WACV.2016.7477729
N. Mevenkamp, B. Berkels
{"title":"Variational multi-phase segmentation using high-dimensional local features","authors":"N. Mevenkamp, B. Berkels","doi":"10.1109/WACV.2016.7477729","DOIUrl":"https://doi.org/10.1109/WACV.2016.7477729","url":null,"abstract":"We propose a novel method for multi-phase segmentation of images based on high-dimensional local feature vectors. While the method was developed for the segmentation of extremely noisy crystal images based on localized Fourier transforms, the resulting framework is not tied to specific feature descriptors. For instance, using local spectral histograms as features, it allows for robust texture segmentation. The segmentation itself is based on the multi-phase Mumford-Shah model. Initializing the high-dimensional mean features directly is computationally too demanding and ill-posed in practice. This is resolved by projecting the features onto a low-dimensional space using principal component analysis. The resulting objective functional is minimized using a convexification and the Chambolle-Pock algorithm. Numerical results are presented, illustrating that the algorithm is very competitive in texture segmentation with state-of-the-art performance on the Prague benchmark and provides new possibilities in crystal segmentation, being robust to extreme noise and requiring no prior knowledge of the crystal structure.","PeriodicalId":124363,"journal":{"name":"2016 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"10 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123174345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
Hide and seek: Uncovering facial occlusion with variable-threshold robust PCA
2016 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2016-03-07 DOI: 10.1109/WACV.2016.7477579
W. Leow, Guodong Li, J. Lai, T. Sim, Vaishali Sharma
{"title":"Hide and seek: Uncovering facial occlusion with variable-threshold robust PCA","authors":"W. Leow, Guodong Li, J. Lai, T. Sim, Vaishali Sharma","doi":"10.1109/WACV.2016.7477579","DOIUrl":"https://doi.org/10.1109/WACV.2016.7477579","url":null,"abstract":"Face images are very important in human social activities, which can be severely hampered when they are corrupted by occluders such as eyeglasses, face marks, and scarves. Existing methods for removing occlusions in face images can be grouped into three broad categories, namely PCA, robust PCA (RPCA), and sparse coding. The major weaknesses of these methods are inconsistent performance across test conditions and possible corruption of unoccluded parts of the recovered target image. This paper presents a variable-threshold RPCA (VRPCA) method based on RPCA with variable thresholding. Comprehensive tests show that VRPCA is able to preserve the unoccluded parts of the target image with practically zero error. Compared to existing methods, it is more accurate, reliable, and consistent across various test conditions.","PeriodicalId":124363,"journal":{"name":"2016 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125928724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Accurate and efficient pulse measurement from facial videos on smartphones
2016 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2016-03-07 DOI: 10.1109/WACV.2016.7477669
Chong Huang, Xin Yang, K. Cheng
{"title":"Accurate and efficient pulse measurement from facial videos on smartphones","authors":"Chong Huang, Xin Yang, K. Cheng","doi":"10.1109/WACV.2016.7477669","DOIUrl":"https://doi.org/10.1109/WACV.2016.7477669","url":null,"abstract":"Non-contact measurement of cardiac pulse signals has attracted great interest due to its convenience and cost effectiveness. However, extracting pulse signals on mobile handheld devices (e.g. smartphones) based on face videos captured by mobile cameras usually suffers from low measurement accuracy due to misalignment errors in face tracking and inevitable illumination changes in a mobile scenario, and low efficiency due to a handheld's limited computing power. We propose two techniques to address these limitations: 1) an accurate and efficient face tracking method based on an Active Shape Model (ASM) and the LDB (Local Difference Binary) feature description; 2) an adaptive temporal filtering method which can detect, and in turn denoise, sharp intensity changes in the source trace. Experimental results demonstrate that the proposed solution can achieve a speedup of 6.2X and is robust to noise in common mobile scenarios.","PeriodicalId":124363,"journal":{"name":"2016 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125970723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Monocular obstacle avoidance for blind people using probabilistic focus of expansion estimation
2016 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2016-03-07 DOI: 10.1109/WACV.2016.7477608
Sebastian Stabinger, A. Rodríguez-Sánchez, J. Piater
{"title":"Monocular obstacle avoidance for blind people using probabilistic focus of expansion estimation","authors":"Sebastian Stabinger, A. Rodríguez-Sánchez, J. Piater","doi":"10.1109/WACV.2016.7477608","DOIUrl":"https://doi.org/10.1109/WACV.2016.7477608","url":null,"abstract":"Visually impaired people have a much higher chance of head injuries in daily life because of obstacles that cannot be reliably detected using conventional aids. We present part of a solution to this problem, using only one head mounted camera and optical flow techniques. As part of the system, a novel method to estimate the focus of expansion is presented, which also provides a metric for the quality of the estimate. The final result is a real time capable software system, which can detect obstacles at eye level.","PeriodicalId":124363,"journal":{"name":"2016 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"426 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123272520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Joint object recognition and pose estimation using a nonlinear view-invariant latent generative model
2016 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2016-03-07 DOI: 10.1109/WACV.2016.7477655
A. Bakry, Tarek El-Gaaly, Mohamed Elhoseiny, A. Elgammal
{"title":"Joint object recognition and pose estimation using a nonlinear view-invariant latent generative model","authors":"A. Bakry, Tarek El-Gaaly, Mohamed Elhoseiny, A. Elgammal","doi":"10.1109/WACV.2016.7477655","DOIUrl":"https://doi.org/10.1109/WACV.2016.7477655","url":null,"abstract":"Object recognition and pose estimation are two fundamental problems in the field of computer vision. Recognizing objects and their poses/viewpoints are critical components of ample vision and robotic systems. Multiple viewpoints of an object lie on an intrinsic low-dimensional manifold in the input space (i.e. descriptor space). Different objects captured from the same set of viewpoints have manifolds with a common topology. In this paper we utilize this common topology between object manifolds by learning a low-dimensional latent space which non-linearly maps between a common unified manifold and the object manifold in the input space. Using a supervised embedding approach, the latent space is computed and used to jointly infer the category and pose of objects. We empirically validate our model by using multiple inference approaches and testing on multiple challenging datasets. We compare our results with the state-of-the-art and present our increased category recognition and pose estimation accuracy.","PeriodicalId":124363,"journal":{"name":"2016 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122108208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Is alice chasing or being chased?: Determining subject and object of activities in videos
2016 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2016-03-07 DOI: 10.1109/WACV.2016.7477710
Teng Zhang, Liangchen Liu, A. Wiliem, B. Lovell
{"title":"Is alice chasing or being chased?: Determining subject and object of activities in videos","authors":"Teng Zhang, Liangchen Liu, A. Wiliem, B. Lovell","doi":"10.1109/WACV.2016.7477710","DOIUrl":"https://doi.org/10.1109/WACV.2016.7477710","url":null,"abstract":"Recent progress in video description has shown promising results by combining object/action recognition and natural language processing techniques. However, even the simplest form of the generated sentence, the SVO triplet (Subject/Verb/Object), can be misleading for its lack of role relationship analysis. When the system detects keywords \"person\", \"baby\" and \"feed\", we do not want the system to generate \"a person feeding a baby\" when the actual screen is a scene where the baby is trying to share the food. In this paper, we explore role relationships between objects/persons and their usage in generating a more meaningful video description. More specifically, we confine ourselves to the following problem: identifying subject and object roles in two-person activities. We argue that the subject and object roles have consistent properties across different activities. To that end, we cast this problem as a domain adaptation problem. A novel YouTube SVO dataset is proposed for evaluating methods developed for this problem. The performance of the proposed method is compared against several baseline methods.","PeriodicalId":124363,"journal":{"name":"2016 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121395469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Multiscale fully convolutional network with application to industrial inspection
2016 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2016-03-07 DOI: 10.1109/WACV.2016.7477595
Xiao Bian, Ser-Nam Lim, Ning Zhou
{"title":"Multiscale fully convolutional network with application to industrial inspection","authors":"Xiao Bian, Ser-Nam Lim, Ning Zhou","doi":"10.1109/WACV.2016.7477595","DOIUrl":"https://doi.org/10.1109/WACV.2016.7477595","url":null,"abstract":"In recent years, deep learning, particularly Convolutional Neural Network (CNN), has shown great efficacy for solving various vision tasks. In image segmentation, it has been demonstrated that a CNN can greatly outperform other approaches. However, special attention has to be paid towards setting various parameters in the CNN that affect the scale of the feature map generated at the last convolutional layer, where scale here refers to the ratio of the number of pixels in the original input image that correspond to each pixel in the feature map. Quite often, the optimal settings are tied to the specific problem at hand and can be fairly challenging to determine. To overcome such an issue, this paper proposes a multiscale Fully Convolutional Network (FCN) that combines networks trained at various scales, thereby allowing for conducting segmentation more generically. Moreover, such a multiscale architecture allows for incremental fine-tuning as more training images become available later on and new networks can be trained and added to the combined network. Such flexibility has great utility in applications such as industrial inspection, where training images may not be readily available initially, yet a high level of accuracy is required. This paper will validate our findings by reporting the results that we have obtained by applying multiscale FCN to the inspection of aircraft engine parts.","PeriodicalId":124363,"journal":{"name":"2016 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128106236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 54