2019 IEEE Winter Conference on Applications of Computer Vision (WACV): Latest Publications

Space-Time Event Clouds for Gesture Recognition: From RGB Cameras to Event Cameras
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00199
Authors: Qinyi Wang, Yexin Zhang, Junsong Yuan, Yilong Lu
Abstract: The recently developed event cameras directly sense motion in the scene by generating an asynchronous sequence of events, i.e., an event stream, where each individual event (x, y, t) corresponds to the space-time location at which a pixel sensor captures an intensity change. Compared with RGB cameras, event cameras are frameless but can capture much faster motion, and therefore have great potential for recognizing fast gestures. To deal with the unique output of event cameras, previous methods often treat event streams as time sequences and thus do not fully exploit the space-time sparsity of the event stream data. In this work, we treat the event stream as a set of 3D points in space-time, i.e., space-time event clouds. To analyze event clouds and recognize gestures, we propose to leverage PointNet, a neural network architecture originally designed for matching and recognizing 3D point clouds. We further adapt PointNet to cater to event clouds for real-time gesture recognition. On the benchmark dataset for event-camera-based gesture recognition, the IBM DVS128 Gesture dataset, our proposed method achieves a high accuracy of 97.08% and performs the best among existing methods.
Citations: 80
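A minimal sketch of the core data representation described above, assuming events arrive as (x, y, t) tuples: each event becomes a point in a space-time cloud that is normalized and subsampled to a fixed size before being fed to a PointNet-style classifier. The array layout, point budget, and normalization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def events_to_point_cloud(events, num_points=1024, rng=None):
    """Turn a DVS event stream into a normalized space-time point cloud.

    events: (N, 3) array of (x, y, t) events from an event camera.
    Returns a (num_points, 3) cloud with each axis scaled to [0, 1],
    the kind of fixed-size set a PointNet-style classifier consumes.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    cloud = np.asarray(events, dtype=np.float64)

    # Scale pixel coordinates and timestamps independently to [0, 1] so the
    # spatial and temporal extents of the gesture are comparable.
    mins = cloud.min(axis=0)
    ranges = np.maximum(cloud.max(axis=0) - mins, 1e-9)
    cloud = (cloud - mins) / ranges

    # Subsample (or resample with replacement) to a fixed point budget.
    idx = rng.choice(len(cloud), size=num_points, replace=len(cloud) < num_points)
    return cloud[idx]

# Example: 50k synthetic events on a 128x128 sensor over a 0.5 s window.
rng = np.random.default_rng(1)
events = np.column_stack([
    rng.integers(0, 128, 50000),            # x
    rng.integers(0, 128, 50000),            # y
    np.sort(rng.uniform(0.0, 0.5, 50000)),  # t in seconds
])
print(events_to_point_cloud(events).shape)  # (1024, 3)
```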
Autonomous Curiosity for Real-Time Training Onboard Robotic Agents
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00163
Authors: Ervin Teng, Bob Iannucci
Abstract: Learning requires both study and curiosity. A good learner is not only good at extracting information from the data given to it, but also skilled at finding the right new information to learn from. This is especially true when a human operator is required to provide the ground truth; such a source should only be queried sparingly. In this work, we address the problem of curiosity as it relates to online, real-time, human-in-the-loop training of an object detection algorithm onboard a robotic platform, one where motion produces new views of the subject. We propose a deep reinforcement learning approach that decides when to ask the human user for ground truth, and when to move. Through a series of experiments, we demonstrate that our agent learns a movement and request policy that is at least 3x more effective at using human user interactions to train an object detector than untrained approaches, and is generalizable to a variety of subjects and environments.
Citations: 4
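The paper learns this query-vs-move decision with deep reinforcement learning; the sketch below is only a hedged illustration of the decision the agent faces, using a fixed confidence threshold as a naive baseline policy. The detector interface and threshold value are assumptions.

```python
from dataclasses import dataclass
from typing import Callable

Action = str  # "query_human" or "move"

@dataclass
class CuriosityBaseline:
    """Threshold baseline for the query-vs-move decision (illustration only).

    detector_confidence: callable returning the detector's confidence in [0, 1]
    for the current view; query_threshold is the level below which we ask the
    human operator for ground truth instead of moving on.
    """
    detector_confidence: Callable[[dict], float]
    query_threshold: float = 0.4

    def act(self, view: dict) -> Action:
        # Ask the human only when the detector is unsure about this view;
        # otherwise spend the time step exploring a new viewpoint.
        if self.detector_confidence(view) < self.query_threshold:
            return "query_human"
        return "move"

# Usage with a stubbed detector: low confidence triggers a query.
agent = CuriosityBaseline(detector_confidence=lambda v: v["conf"])
print(agent.act({"conf": 0.2}))  # query_human
print(agent.act({"conf": 0.9}))  # move
```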
Toward Computer Vision Systems That Understand Real-World Assembly Processes
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00051
Authors: Jonathan D. Jones, Gregory Hager, S. Khudanpur
Abstract: Many applications of computer vision require robust systems that can parse complex structures as they evolve in time. Using a block construction task as a case study, we illustrate the main components involved in building such systems. We evaluate performance at three increasingly detailed levels of spatial granularity on two multimodal (RGBD + IMU) datasets. On the first, designed to match the assumptions of the model, we report better than 90% accuracy at the finest level of granularity. On the second, designed to test the robustness of our model under adverse, real-world conditions, we report 67% accuracy and 91% precision at the mid-level of granularity. We show that this seemingly simple process presents many opportunities to expand the frontiers of computer vision and action recognition.
Citations: 7
CDNet: Single Image De-Hazing Using Unpaired Adversarial Training
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00127
Authors: Akshay Dudhane, S. Murala
Abstract: Outdoor scene images generally undergo visibility degradation in the presence of aerosol particles such as haze, fog, and smoke. The reason is that aerosol particles scatter the light rays reflected from object surfaces, attenuating the light intensity. The effect of haze is inversely proportional to the transmission coefficient of the scene point, so estimating an accurate transmission map (TrMap) is a key step in reconstructing the haze-free scene. Previous methods used various assumptions or priors to estimate the scene TrMap, and existing end-to-end dehazing approaches use supervised training to predict the TrMap from synthetically generated paired hazy images. Despite their success, these approaches fail under extreme real-world hazy conditions because real-world hazy image pairs are unavailable for training the network. In this paper, a cycle-consistent generative adversarial network for single-image dehazing, named CDNet, is proposed; it is trained in an unpaired manner on a real-world hazy image dataset. The generator network of CDNet comprises an encoder-decoder architecture that estimates the object-level TrMap, followed by an optical model to recover the haze-free scene. We conduct experiments on four datasets: D-HAZY [1], ImageNet [5], SOTS [20], and real-world images. The structural similarity index, peak signal-to-noise ratio, and CIEDE2000 metric are used to evaluate the performance of the proposed CDNet. Experiments on benchmark datasets show that the proposed CDNet outperforms existing state-of-the-art methods for single-image haze removal.
Citations: 48
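The "optical model" used to recover the haze-free scene from a transmission map is commonly the atmospheric scattering model, I(x) = J(x)·t(x) + A·(1 − t(x)); the sketch below shows only that standard inversion step, not CDNet itself, and the airlight estimate and the lower bound on t are assumptions.

```python
import numpy as np

def recover_haze_free(hazy, trmap, airlight, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t).

    hazy:     (H, W, 3) hazy image in [0, 1]
    trmap:    (H, W) estimated transmission map in (0, 1]
    airlight: (3,) global atmospheric light estimate
    t_min:    lower bound on t to avoid amplifying noise where haze is dense
    """
    t = np.clip(trmap, t_min, 1.0)[..., None]    # (H, W, 1)
    scene = (hazy - airlight * (1.0 - t)) / t    # solve for J
    return np.clip(scene, 0.0, 1.0)

# Toy round trip: synthesize haze from a known scene, then recover it.
rng = np.random.default_rng(0)
scene = rng.uniform(0, 1, (4, 4, 3))
trmap = rng.uniform(0.3, 0.9, (4, 4))
airlight = np.array([0.8, 0.8, 0.8])
hazy = scene * trmap[..., None] + airlight * (1 - trmap[..., None])
restored = recover_haze_free(hazy, trmap, airlight)
print(np.allclose(restored, scene, atol=1e-6))  # True
```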
Defocus Magnification Using Conditional Adversarial Networks
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00147
Authors: P. Sakurikar, Ishit Mehta, P J Narayanan
Abstract: Defocus magnification is the process of rendering a shallow depth-of-field in an image captured using a camera with a narrow aperture. Defocus magnification is a useful tool in photography for emphasis on the subject and for highlighting background bokeh. Estimating the per-pixel blur kernel or the depth-map of the scene followed by spatially-varying re-blurring is the standard approach to defocus magnification. We propose a single-step approach that directly converts a narrow-aperture image to a wide-aperture image. We use a conditional adversarial network trained on multi-aperture images created from light-fields. We use a novel loss term based on a composite focus measure to improve generalization and show high quality defocus magnification.
Citations: 5
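The loss above is built on a composite focus measure, i.e., a per-pixel score of how in-focus a region is. As a hedged illustration of one classic ingredient such a measure could combine (not the authors' composite term), the sketch below computes a sum-modified-Laplacian focus map; the window size is an assumption.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def modified_laplacian_focus(gray, window=9):
    """Sum-modified-Laplacian focus map: higher values = sharper region.

    gray: (H, W) grayscale image as float; returns an (H, W) focus map.
    """
    p = np.pad(gray, 1, mode="edge")
    # Modified Laplacian: absolute second differences along x and y.
    ml = (np.abs(2 * p[1:-1, 1:-1] - p[1:-1, :-2] - p[1:-1, 2:]) +
          np.abs(2 * p[1:-1, 1:-1] - p[:-2, 1:-1] - p[2:, 1:-1]))
    # Aggregate the response over a local window.
    return uniform_filter(ml, size=window, mode="nearest")

# Sharp texture scores higher than a smooth ramp of the same size.
rng = np.random.default_rng(0)
sharp = rng.uniform(0, 1, (64, 64))
smooth = np.tile(np.linspace(0, 1, 64), (64, 1))
print(modified_laplacian_focus(sharp).mean() >
      modified_laplacian_focus(smooth).mean())  # True
```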
Single-Shot Analysis of Refractive Shape Using Convolutional Neural Networks
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00111
Authors: J. D. Stets, Zhengqin Li, J. Frisvad, Manmohan Chandraker
Abstract: The appearance of a transparent object is determined by a combination of refraction and reflection, as governed by a complex function of its shape as well as the surrounding environment. Prior works on 3D reconstruction have largely ignored transparent objects due to this challenge, yet they occur frequently in real-world scenes. This paper presents an approach to estimate depths and normals for transparent objects using a single image acquired under a distant but otherwise arbitrary environment map. In particular, we use a deep convolutional neural network (CNN) for this task. Unlike opaque objects, it is challenging to acquire ground truth training data for refractive objects, thus, we propose to use a large-scale synthetic dataset. To accurately capture the image formation process, we use a physically-based renderer. We demonstrate that a CNN trained on our dataset learns to reconstruct shape and estimate segmentation boundaries for transparent objects using a single image, while also achieving generalization to real images at test time. In experiments, we extensively study the properties of our dataset and compare to baselines demonstrating its utility.
Citations: 13
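The network above predicts both depth and normals for the transparent object. The two outputs are geometrically linked: normals can be derived from depth gradients, which is also how consistency between such predictions is commonly checked. The sketch below shows that generic relation under a simplifying orthographic assumption; it is not the paper's CNN.

```python
import numpy as np

def normals_from_depth(depth):
    """Derive per-pixel surface normals from a depth map.

    Assumes an orthographic projection for simplicity: the (unnormalized)
    normal at each pixel is (-dz/dx, -dz/dy, 1).
    depth: (H, W) depth map; returns (H, W, 3) unit normals.
    """
    dz_dy, dz_dx = np.gradient(depth)
    n = np.dstack([-dz_dx, -dz_dy, np.ones_like(depth)])
    return n / np.linalg.norm(n, axis=2, keepdims=True)

# A planar ramp tilted along x yields the same normal at every pixel.
yy, xx = np.mgrid[0:32, 0:32]
plane = 0.5 * xx.astype(float)
normals = normals_from_depth(plane)
print(np.allclose(normals, normals[0, 0]))  # True
```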
[Copyright notice]
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/wacv.2019.00003
Citations: 0
Segmenting Sky Pixels in Images: Analysis and Comparison
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00189
Authors: Cecilia La Place, Aisha Urooj Khan, A. Borji
Abstract: This work addresses sky segmentation, the task of determining sky and non-sky pixels in images, and improving upon existing state-of-the-art models. Outdoor scene parsing models are often trained on ideal datasets and produce high-quality results. However, this leads to inferior performance when applied to real-world images. The quality of scene parsing, particularly sky segmentation, decreases in night-time images, images involving varying weather conditions, and scene changes due to seasonal weather. We address these challenges using the RefineNet model in conjunction with two datasets: SkyFinder, and a subset of the SUN database containing sky regions (SUN-sky, henceforth). We achieve an improvement of 10-15% in the average MCR compared to prior methods using the SkyFinder dataset, and nearly 36% improvement from an off-the-shelf model in terms of average mIOU score. Employing fully connected conditional random fields as a post processing method demonstrates further enhancement of our results. Furthermore, by analyzing models over images with respect to two aspects, time of day and weather conditions, we find that when facing the same challenges as prior methods, our trained models significantly outperform them.
Citations: 9
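The two figures quoted above, MCR (here taken to be the per-pixel misclassification rate) and mean IOU, both follow from the binary sky/non-sky confusion counts. A small sketch of the standard definitions, assuming per-image binary masks (the paper may average per image or per dataset differently):

```python
import numpy as np

def sky_segmentation_metrics(pred, gt):
    """Misclassification rate and mean IOU for binary sky masks.

    pred, gt: boolean (H, W) masks where True = sky pixel.
    mIOU averages the IOU of the sky class and the non-sky class.
    """
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    mcr = float(np.mean(pred != gt))  # fraction of mislabeled pixels

    ious = []
    for cls in (True, False):  # sky class, then non-sky class
        inter = np.logical_and(pred == cls, gt == cls).sum()
        union = np.logical_or(pred == cls, gt == cls).sum()
        ious.append(inter / union if union else 1.0)
    return mcr, float(np.mean(ious))

# Toy example: the prediction misses a 2-pixel strip of sky.
gt = np.zeros((4, 4), bool); gt[:2] = True   # top half is sky
pred = gt.copy(); pred[1, :2] = False        # 2 pixels misclassified
mcr, miou = sky_segmentation_metrics(pred, gt)
print(round(mcr, 3), round(miou, 3))         # 0.125 0.775
```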
Attentive and Adversarial Learning for Video Summarization
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00173
Authors: Tsu-Jui Fu, Shao-Heng Tai, Hwann-Tzong Chen
Abstract: This paper aims to address the video summarization problem via attention-aware and adversarial training. We formulate the problem as a sequence-to-sequence task, where the input sequence is an original video and the output sequence is its summarization. We propose a GAN-based training framework, which combines the merits of unsupervised and supervised video summarization approaches. The generator is an attention-aware Ptr-Net that generates the cutting points of summarization fragments. The discriminator is a 3D CNN classifier to judge whether a fragment is from a ground-truth or a generated summarization. The experiments show that our method achieves state-of-the-art results on SumMe, TVSum, YouTube, and LoL datasets with 1.5% to 5.6% improvements. Our Ptr-Net generator can overcome the unbalanced training-test length in the seq2seq problem, and our discriminator is effective in leveraging unpaired summarizations to achieve better performance.
Citations: 53
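Results on SumMe and TVSum are conventionally reported as an overlap F-score between the selected summary and user annotations; the exact protocol used here is an assumption, but the sketch below shows the basic computation against a single ground-truth selection.

```python
import numpy as np

def summary_f_score(pred_sel, gt_sel):
    """F-score between predicted and ground-truth frame selections.

    pred_sel, gt_sel: boolean arrays, True where a frame is kept in the summary.
    """
    pred_sel, gt_sel = np.asarray(pred_sel, bool), np.asarray(gt_sel, bool)
    overlap = np.logical_and(pred_sel, gt_sel).sum()
    if overlap == 0 or pred_sel.sum() == 0 or gt_sel.sum() == 0:
        return 0.0
    precision = overlap / pred_sel.sum()
    recall = overlap / gt_sel.sum()
    return 2 * precision * recall / (precision + recall)

# 100-frame video: prediction and annotation overlap on frames 10-19.
pred = np.zeros(100, bool); pred[10:25] = True   # 15 frames selected
gt = np.zeros(100, bool);   gt[5:20] = True      # 15 frames annotated
print(round(summary_f_score(pred, gt), 3))       # 0.667
```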
Photo-Sketching: Inferring Contour Drawings From Images
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00154
Authors: Mengtian Li, Zhe L. Lin, R. Mech, Ersin Yumer, Deva Ramanan
Abstract: Edges, boundaries, and contours are important subjects of study in both computer graphics and computer vision. On one hand, they are the 2D elements that convey 3D shapes; on the other hand, they are indicative of occlusion events and thus of the separation of objects or semantic concepts. In this paper, we aim to generate contour drawings: boundary-like drawings that capture the outline of the visual scene. Prior art often casts this problem as boundary detection. However, the set of visual cues presented in boundary detection output differs from that in contour drawings, and the artistic style is ignored. We address these issues by collecting a new dataset of contour drawings and proposing a learning-based method that resolves diversity in the annotation and, unlike boundary detectors, can work with imperfect alignment between the annotation and the actual ground truth. Our method surpasses previous methods quantitatively and qualitatively. Surprisingly, when our model is fine-tuned on BSDS500, we achieve state-of-the-art performance in salient boundary detection, suggesting contour drawing might be a scalable alternative to boundary annotation, which is at the same time easier and more interesting for annotators to draw.
Citations: 93