2019 IEEE Winter Conference on Applications of Computer Vision (WACV)最新文献_第9页

Digging Deeper Into Egocentric Gaze Prediction 深入挖掘以自我为中心的凝视预测

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00035

H. R. Tavakoli, Esa Rahtu, Juho Kannala, A. Borji

{"title":"Digging Deeper Into Egocentric Gaze Prediction","authors":"H. R. Tavakoli, Esa Rahtu, Juho Kannala, A. Borji","doi":"10.1109/WACV.2019.00035","DOIUrl":"https://doi.org/10.1109/WACV.2019.00035","url":null,"abstract":"This paper digs deeper into factors that influence egocentric gaze. Instead of training deep models for this purpose in a blind manner, we propose to inspect factors that contribute to gaze guidance during daily tasks. Bottom-up saliency and optical flow are assessed versus strong spatial prior baselines. Task-specific cues such as vanishing point, manipulation point, and hand regions are analyzed as representatives of top-down information. We also look into the contribution of these factors by investigating a simple recurrent neural model for ego-centric gaze prediction. First, deep features are extracted for all input video frames. Then, a gated recurrent unit is employed to integrate information over time and to predict the next fixation. We propose an integrated model that combines the recurrent model with several top-down and bottom-up cues. Extensive experiments over multiple datasets reveal that (1) spatial biases are strong in egocentric videos, (2) bottom-up attention models perform poorly in predicting gaze and underperform spatial biases, (3) deep features perform better compared to traditional features, (4) as opposed to hand regions, the manipulation point is a strong influential cue for gaze prediction, (5) combining the proposed recurrent model with bottom-up cues, vanishing points and, in particular, manipulation point results in the best gaze prediction accuracy over egocentric videos, (6) the knowledge transfer works best for cases where the tasks or sequences are similar, and (7) task and activity recognition can benefit from gaze prediction. Our findings suggest that (1) there should be more emphasis on hand-object interaction and (2) the egocentric vision community should consider larger datasets including diverse stimuli and more subjects.","PeriodicalId":436637,"journal":{"name":"2019 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126496953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 24

Analyzing Modern Camera Response Functions 分析现代相机响应函数

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00213

Can Chen, Scott McCloskey, Jingyi Yu

引用次数: 5

Task Relation Networks 任务关系网络

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00104

Jianshu Li, Pan Zhou, Yunpeng Chen, Jian Zhao, S. Roy, Shuicheng Yan, Jiashi Feng, T. Sim

引用次数: 15

Warping-Based Stereoscopic 3D Video Retargeting With Depth Remapping 基于深度重映射的立体3D视频重定位

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00181

Md Baharul Islam, L. Wong, Kok-Lim Low, Chee-Onn Wong

引用次数: 9

Crowd Counting Using Scale-Aware Attention Networks 使用规模感知注意力网络进行人群计数

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00141

M. Hossain, M. Hosseinzadeh, Omit Chanda, Yang Wang

引用次数: 110

3D Reconstruction and Texture Optimization Using a Sparse Set of RGB-D Cameras 基于RGB-D相机稀疏集的三维重建和纹理优化

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00155

Wei Li, Xiao Xiao, J. Hahn

引用次数: 15

A Conditional Deep Generative Model of People in Natural Images 自然图像中人物的条件深度生成模型

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00159

Rodrigo de Bem, Arna Ghosh, A. Boukhayma, Thalaiyasingam Ajanthan, N. Siddharth, Philip H. S. Torr

引用次数: 20

Improving Diversity of Image Captioning Through Variational Autoencoders and Adversarial Learning 通过变分自编码器和对抗学习提高图像标题的多样性

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00034

Li Ren, Guo-Jun Qi, K. Hua

{"title":"Improving Diversity of Image Captioning Through Variational Autoencoders and Adversarial Learning","authors":"Li Ren, Guo-Jun Qi, K. Hua","doi":"10.1109/WACV.2019.00034","DOIUrl":"https://doi.org/10.1109/WACV.2019.00034","url":null,"abstract":"Learning translation from images to human-readable natural language has become a great challenge in computer vision research in recent years. Existing works explore the semantic correlation between the visual and language domains via encoder-to-decoder learning frameworks based on classifying visual features in the language domain. This approach, however, is criticized for its lacking of naturalness and diversity. In this paper, we demonstrate a novel way to learn a semantic connection between visual information and natural language directly based on a Variational Autoencoder (VAE) that is trained in an adversarial routine. Instead of using the classification based discriminator, our method directly learns to estimate the diversity between a hidden vector embedded from a text encoder and an informative feature that is sampled from a learned distribution of the autoencoders. We show that the sentences learned from this matching contains accurate semantic meaning with high diversity in the image captioning task. Our experiments on the popular MSCOCO dataset indicates that our method learns to generate high-quality natural language with competitive scores on both correctness and diversity.","PeriodicalId":436637,"journal":{"name":"2019 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116169966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Skip Residual Pairwise Networks With Learnable Comparative Functions for Few-Shot Learning 基于可学习比较函数的残差两两网络进行少次学习

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00099

A. Mehrotra, Ambedkar Dukkipati

{"title":"Skip Residual Pairwise Networks With Learnable Comparative Functions for Few-Shot Learning","authors":"A. Mehrotra, Ambedkar Dukkipati","doi":"10.1109/WACV.2019.00099","DOIUrl":"https://doi.org/10.1109/WACV.2019.00099","url":null,"abstract":"In this work we consider the ubiquitous Siamese network architecture and hypothesize that having an end-to-end learnable comparative function instead of an arbitrarily fixed one used commonly in practice (such as dot product) would allow the network to learn a final representation more suited to the task at hand and generalize better with very small quantities of data. Based on this we propose Skip Residual Pairwise Networks (SRPN) for few-shot learning based on residual Siamese networks. We validate our hypothesis by evaluating the proposed model for few-shot learning on Omniglot and mini-Imagenet datasets. Our model outperforms the residual Siamese design of equal depth and parameters. We also show that our model is competitive with state-of-the-art meta-learning based methods for few-shot learning on the challenging mini-Imagenet dataset whilst being a much simpler design, obtaining 54.4% accuracy on the five-way few-shot learning task with only a single example per class and over 70% accuracy with five examples per class. We further observe that the network weights in our model are much smaller compared to an equivalent residual Siamese Network under similar regularization, thus validating our hypothesis that our model design allows for better generalization. We also observe that our asymmetric, non-metric SRPN design automatically learns to approximate natural metric learning priors such as a symmetry and the triangle inequality.","PeriodicalId":436637,"journal":{"name":"2019 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121469679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

Cross Domain Residual Transfer Learning for Person Re-Identification 人物再识别的跨域残差迁移学习

2019 IEEE Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2019-01-01 DOI: 10.1109/WACV.2019.00219

Furqan Khan, F. Brémond

引用次数: 3