2019 IEEE Winter Conference on Applications of Computer Vision (WACV): Latest Publications

[Title page i]
2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/wacv.2019.00001
Citations: 0
Semi-Supervised Convolutional Neural Networks for In-Situ Video Monitoring of Selective Laser Melting
2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/WACV.2019.00084
Bodi Yuan, B. Giera, G. Guss, Ibo Matthews, Sara McMains
Abstract: Selective Laser Melting (SLM) is a metal additive manufacturing technique. The lack of SLM process repeatability is a barrier to industrial adoption: SLM product quality is hard to control, even when using fixed system settings. Thus SLM could benefit from a monitoring system that provides quality assessments in real time. Since there is no publicly available SLM dataset, we ran experiments to collect over one thousand videos, measured the physical output via height-map images, and applied a proposed image processing algorithm to them to produce a dataset for semi-supervised learning. We then trained convolutional neural networks (CNNs) to recognize desired quality metrics from videos. Experimental results demonstrate the effectiveness of our proposed monitoring approach and also show that the semi-supervised model can mitigate the time and expense of labeling an entire SLM dataset.
Citations: 32
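The abstract does not specify the semi-supervised training scheme. The sketch below illustrates one common recipe (confidence-thresholded pseudo-labeling) for frame-level quality classification; the architecture, threshold, and toy tensors are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch: pseudo-labeling, one common semi-supervised recipe.
# Not the paper's exact method; model and threshold are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QualityCNN(nn.Module):
    """Small CNN mapping a grayscale melt-pool frame to a binary quality score."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def train_step(model, opt, labeled, unlabeled, conf_thresh=0.95, unlab_weight=0.5):
    """One update combining a supervised loss with a pseudo-labeled loss."""
    x_l, y_l = labeled          # (B, 1, H, W), (B,)
    x_u, = unlabeled            # (B, 1, H, W) frames without quality labels
    model.train()
    opt.zero_grad()

    sup_loss = F.cross_entropy(model(x_l), y_l)

    # Pseudo-label confident unlabeled frames and train on them as well.
    with torch.no_grad():
        probs = F.softmax(model(x_u), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf > conf_thresh
    unsup_loss = (F.cross_entropy(model(x_u[mask]), pseudo[mask])
                  if mask.any() else torch.zeros((), device=x_u.device))

    loss = sup_loss + unlab_weight * unsup_loss
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with random tensors standing in for video frames.
model = QualityCNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
labeled = (torch.randn(8, 1, 64, 64), torch.randint(0, 2, (8,)))
unlabeled = (torch.randn(8, 1, 64, 64),)
print(train_step(model, opt, labeled, unlabeled))
```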
Single Image Deblurring and Camera Motion Estimation With Depth Map
2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/WACV.2019.00229
Liyuan Pan, Yuchao Dai, Miaomiao Liu
Abstract: Camera shake during exposure is a major problem in hand-held photography, as it causes image blur that destroys details in the captured images. In the real world, such blur is caused by both the camera motion and the complex scene structure. While many approaches have been proposed based on various assumptions about the scene structure or the camera motion, few can handle real 6 DoF camera motion. In this paper, we propose to jointly estimate the 6 DoF camera motion and remove the non-uniform blur it causes by exploiting their underlying geometric relationships, with a single blurry image and its depth map (either direct depth measurements or a learned depth map) as input. We formulate joint deblurring and 6 DoF camera motion estimation as an energy minimization problem which is solved in an alternating manner. Our model enables the recovery of the 6 DoF camera motion and the latent clean image, and can also generate a sharp sequence from a single blurry image. Experiments on challenging real-world and synthetic datasets demonstrate that image blur from camera shake can be well addressed within our proposed framework.
Citations: 17
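The energy formulation is only named in the abstract. Below is a minimal LaTeX sketch of what such an objective typically looks like; the specific data term, prior, and weight are assumptions, not the paper's exact model.

\[
\min_{L,\;\xi}\;\big\| B - A(\xi, D)\,L \big\|_2^2 \;+\; \lambda\,\big\| \nabla L \big\|_1
\]

where \(B\) is the observed blurry image, \(D\) its depth map, \(L\) the latent sharp image, \(\xi \in \mathfrak{se}(3)\) the 6 DoF camera motion over the exposure, and \(A(\xi, D)\) the non-uniform blur operator obtained by averaging depth-dependent warps of \(L\) along the motion trajectory. Alternating minimization updates \(L\) with \(\xi\) fixed (a non-uniform deconvolution step) and \(\xi\) with \(L\) fixed (a motion estimation step).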
Semantic Matching by Weakly Supervised 2D Point Set Registration
2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/WACV.2019.00118
Zakaria Laskar, H. R. Tavakoli, Juho Kannala
Abstract: In this paper we address the problem of establishing correspondences between different instances of the same object. The problem is posed as finding the geometric transformation that aligns a given image pair. We use a convolutional neural network (CNN) to directly regress the parameters of the transformation model. The alignment problem is defined in the setting where an unordered set of semantic keypoints per image is available, but without correspondence information. To this end we propose a novel loss function based on cyclic consistency that solves this 2D point set registration problem by inferring the optimal geometric transformation model parameters. We train and test our approach on the standard benchmark dataset Proposal Flow (PF-PASCAL). The proposed approach achieves state-of-the-art results, demonstrating the effectiveness of the method. In addition, we show that our approach benefits further from additional training samples generated in PF-PASCAL using category-level information.
Citations: 7
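As a concrete illustration of training without correspondence information, here is a minimal sketch of a cycle-consistency loss on unordered 2D point sets under predicted affine transforms. The affine model, the Chamfer-style alignment term, and all shapes are assumptions; the paper's exact loss and regression network are not reproduced.

```python
# Hedged sketch of a cycle-consistency loss for unordered 2D keypoint sets.
import torch

def apply_affine(theta, pts):
    """theta: (B, 2, 3) affine parameters, pts: (B, N, 2) -> (B, N, 2)."""
    ones = torch.ones(*pts.shape[:2], 1, device=pts.device)
    homog = torch.cat([pts, ones], dim=-1)            # (B, N, 3)
    return homog @ theta.transpose(1, 2)              # (B, N, 2)

def chamfer(a, b):
    """Symmetric Chamfer distance between unordered point sets (B,N,2), (B,M,2)."""
    d = torch.cdist(a, b)                             # (B, N, M)
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()

def cycle_consistency_loss(theta_ab, theta_ba, pts_a, pts_b):
    """Warp A->B->A (and B->A->B) and require the round trip to be the identity,
    plus a Chamfer term aligning each warped set with its target set."""
    a_in_b = apply_affine(theta_ab, pts_a)
    b_in_a = apply_affine(theta_ba, pts_b)
    a_back = apply_affine(theta_ba, a_in_b)
    b_back = apply_affine(theta_ab, b_in_a)
    cycle = ((a_back - pts_a) ** 2).mean() + ((b_back - pts_b) ** 2).mean()
    align = chamfer(a_in_b, pts_b) + chamfer(b_in_a, pts_a)
    return cycle + align

# Toy usage: identity transforms give a zero cycle term.
pts_a, pts_b = torch.rand(4, 10, 2), torch.rand(4, 12, 2)
eye = torch.eye(2, 3).expand(4, -1, -1)
print(cycle_consistency_loss(eye, eye, pts_a, pts_b).item())
```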
Deep Representation Learning Characterized by Inter-Class Separation for Image Clustering
2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/WACV.2019.00072
Dipanjan Das, Ratul Ghosh, B. Bhowmick
Abstract: Despite significant advances in clustering methods in recent years, clustering natural image datasets remains unsatisfactory due to two important drawbacks. First, clustering images requires a good feature representation of each image; second, we need a robust method that can discriminate these features so that they fall into different clusters with low intra-class variance and high inter-class variance. Often these two aspects are handled independently, and the resulting features are not sufficient to partition the data meaningfully. In this paper, we propose a method that discovers the features required for separating the images using a deep autoencoder. Our method learns image representation features automatically for the purpose of clustering, and simultaneously selects a coherent image and an incoherent image for each given image, so that the representation learning acquires more discriminative features for grouping similar images within a cluster while separating dissimilar images across clusters. Experimental results show that our method produces significantly better results than state-of-the-art methods, and we also show that it generalizes better across different datasets without using any pre-trained model, unlike other existing methods.
Citations: 5
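A minimal sketch of the general idea, an autoencoder embedding trained with a reconstruction loss plus a triplet-style term that pulls a coherent image toward the anchor and pushes an incoherent image away, is given below. The architecture, margin, and the toy way coherent/incoherent samples are chosen are assumptions, not the paper's selection procedure.

```python
# Hedged sketch: autoencoder embedding + triplet-style separation term.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AutoEncoder(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(),
                                 nn.Linear(256, dim))
        self.dec = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(),
                                 nn.Linear(256, 28 * 28))

    def forward(self, x):
        z = self.enc(x)                      # embedding used for clustering
        recon = self.dec(z).view(x.shape)    # reconstruction of the input image
        return z, recon

def clustering_loss(model, anchor, coherent, incoherent, margin=1.0, w=0.1):
    """Reconstruction keeps the embedding informative; the triplet term enforces
    low intra-cluster and high inter-cluster distances."""
    z_a, rec_a = model(anchor)
    z_p, _ = model(coherent)
    z_n, _ = model(incoherent)
    recon = F.mse_loss(rec_a, anchor)
    triplet = F.triplet_margin_loss(z_a, z_p, z_n, margin=margin)
    return recon + w * triplet

# Toy usage with random images; coherent/incoherent picks are placeholders.
model = AutoEncoder()
imgs = torch.rand(8, 1, 28, 28)
loss = clustering_loss(model, imgs, imgs.roll(1, 0), torch.rand(8, 1, 28, 28))
print(loss.item())
```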
A Hierarchical Grocery Store Image Dataset With Visual and Semantic Labels
2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/WACV.2019.00058
Marcus Klasson, Cheng Zhang, H. Kjellström
Abstract: Image classification models built into visual support systems and other assistive devices need to provide accurate predictions about their environment. We focus on an application of assistive technology for people with visual impairments, supporting daily activities such as shopping or cooking. In this paper, we provide a new benchmark dataset for a challenging task in this application: classification of fruits, vegetables, and refrigerated products, e.g. milk packages and juice cartons, in grocery stores. To enable the learning process to utilize multiple sources of structured information, this dataset not only contains a large volume of natural images but also includes the corresponding product information from an online shopping website. Such information encompasses the hierarchical structure of the object classes, as well as an iconic image of each type of object. This dataset can be used to train and evaluate image classification models for helping visually impaired people in natural environments. Additionally, we provide benchmark results evaluated on pretrained convolutional neural networks often used for image understanding purposes, as well as a multi-view variational autoencoder, which is capable of utilizing the rich product information in the dataset.
Citations: 32
FuturePose - Mixed Reality Martial Arts Training Using Real-Time 3D Human Pose Forecasting With a RGB Camera
2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/WACV.2019.00152
Erwin Wu, H. Koike
Abstract: In this paper, we propose a novel mixed reality martial arts training system using deep-learning-based real-time human pose forecasting. Our training system is based on 3D pose estimation using a residual neural network with input from an RGB camera, which captures the motion of a trainer. The student, wearing a head-mounted display, can see the virtual model of the trainer and the trainer's forecasted future pose. The pose forecasting is based on recurrent networks; to improve learning of the motion's temporal features, we use a special lattice optical flow method for joint movement estimation. We visualize the real-time human motion with a generated human model, while the forecasted pose is shown as a red skeleton model. In our experiments, we evaluated the performance of our system when predicting 15 frames ahead in a 30-fps video (0.5 s forecasting); the accuracies were acceptable, matching or even outperforming some methods that use depth/IR cameras or fabric technologies. User studies showed that our system helps beginners understand martial arts and is comfortable to use, since the motions are captured with an RGB camera.
Citations: 47
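For the forecasting component, a minimal sketch of an LSTM that regresses the 3D pose 15 frames (0.5 s at 30 fps) ahead from a short pose history is shown below. The joint count, window length, and layer sizes are assumptions; the lattice optical flow input and the residual 3D pose estimator are not reproduced.

```python
# Hedged sketch: LSTM pose forecaster, 15 frames (0.5 s at 30 fps) ahead.
import torch
import torch.nn as nn

class PoseForecaster(nn.Module):
    def __init__(self, num_joints=17, hidden=256, horizon_frames=15):
        super().__init__()
        self.horizon = horizon_frames
        self.lstm = nn.LSTM(input_size=num_joints * 3, hidden_size=hidden,
                            num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, num_joints * 3)

    def forward(self, past_poses):
        """past_poses: (B, T, J, 3) -> predicted pose at t + horizon: (B, J, 3)."""
        b, t, j, _ = past_poses.shape
        feats, _ = self.lstm(past_poses.reshape(b, t, j * 3))
        return self.head(feats[:, -1]).reshape(b, j, 3)

# Toy usage: 1 s of history (30 frames), predict 0.5 s ahead.
model = PoseForecaster()
history = torch.randn(2, 30, 17, 3)
future = model(history)
print(future.shape)  # torch.Size([2, 17, 3])
```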
Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance
2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/WACV.2019.00054
Vishal Kaushal, Rishabh K. Iyer, Khoshrav Doctor, Anurag Sahoo, P. Dubal, S. Kothawade, Rohan Mahadev, Kunal Dargan, Ganesh Ramakrishnan
Abstract: This paper addresses automatic summarization of videos in a unified manner. In particular, we propose a framework for multi-faceted summarization covering extractive, query-based, and entity summarization (summarization at the level of entities such as objects, scenes, humans, and faces in the video). We investigate several summarization models which capture notions of diversity, coverage, representation, and importance, and argue for the utility of these different models depending on the application. While most prior work on submodular summarization has focused on combining several models and learning weighted mixtures, we focus on the explainability of the different models and featurizations, and how they apply to different domains. We also provide implementation details on summarization systems and the different modalities involved. We hope that this study will give practitioners insight into choosing the right summarization models for the problems at hand.
Citations: 9
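As an example of one of the submodular models discussed (a representation objective), below is a minimal sketch of greedy maximization of a facility-location function over frame features. The cosine-similarity features and the budget are illustrative assumptions.

```python
# Hedged sketch: greedy facility-location summarization over frame features.
import numpy as np

def greedy_summary(features, budget):
    """Greedily pick `budget` frames maximizing f(S) = sum_i max_{j in S} sim(i, j),
    so every frame in the video is well represented by some selected frame."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T                                   # cosine similarity, (n, n)
    n = len(features)
    selected, best_cover = [], np.zeros(n)          # best similarity to the summary so far
    for _ in range(budget):
        remaining = [c for c in range(n) if c not in selected]
        gains = [np.maximum(best_cover, sim[:, c]).sum() - best_cover.sum()
                 for c in remaining]
        chosen = remaining[int(np.argmax(gains))]
        selected.append(chosen)
        best_cover = np.maximum(best_cover, sim[:, chosen])
    return selected

# Toy usage: 100 frames with 64-d features, pick a 5-frame summary.
frames = np.random.rand(100, 64)
print(greedy_summary(frames, budget=5))
```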
Where to Focus on for Human Action Recognition?
2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/WACV.2019.00015
Srijan Das, Arpit Chaudhary, F. Brémond, M. Thonnat
Abstract: In this paper, we present a new attention model for the recognition of human actions from RGB-D videos. We propose an attention mechanism based on 3D articulated pose. The objective is to focus on the most relevant body parts involved in the action. For action classification, we propose a classification network composed of spatio-temporal subnetworks modeling the appearance of human body parts and an RNN attention subnetwork implementing our attention mechanism. Furthermore, we train the proposed network end-to-end using a regularized cross-entropy loss, so that the RNN is trained jointly and delivers attention globally over the whole set of spatio-temporal features extracted from 3D ConvNets. Our method outperforms state-of-the-art methods on the largest human activity recognition dataset available to date (the NTU RGB+D dataset), which is also multi-view, and on a human action recognition dataset with object interaction (the Northwestern-UCLA Multiview Action 3D dataset).
Citations: 32
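A minimal sketch of the overall idea, an RNN over 3D skeleton sequences producing soft attention weights over body-part appearance features from a spatio-temporal backbone, is given below. Feature dimensions, the number of body parts, and the fusion are assumptions, not the paper's exact architecture.

```python
# Hedged sketch: pose-driven attention over body-part appearance features.
import torch
import torch.nn as nn

class PoseDrivenAttention(nn.Module):
    def __init__(self, num_joints=25, num_parts=5, feat_dim=512, hidden=128,
                 num_classes=60):
        super().__init__()
        self.pose_rnn = nn.GRU(num_joints * 3, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, num_parts)        # one weight per body part
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, pose_seq, part_feats):
        """pose_seq: (B, T, J*3) skeleton sequence;
        part_feats: (B, P, C) appearance features per body part from a 3D ConvNet."""
        _, h = self.pose_rnn(pose_seq)                   # h: (1, B, hidden)
        weights = torch.softmax(self.attn(h[-1]), dim=1)             # (B, P)
        pooled = (weights.unsqueeze(-1) * part_feats).sum(dim=1)     # (B, C)
        return self.classifier(pooled)

# Toy usage.
model = PoseDrivenAttention()
logits = model(torch.randn(2, 30, 25 * 3), torch.randn(2, 5, 512))
print(logits.shape)  # torch.Size([2, 60])
```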
Resultant Based Incremental Recovery of Camera Pose From Pairwise Matches
2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Pub Date: 2019-01-01. DOI: 10.1109/WACV.2019.00120
Y. Kasten, M. Galun, R. Basri
Abstract: Incremental (online) structure-from-motion pipelines seek to recover the camera matrix associated with an image I_n given n-1 images, I_1, ..., I_{n-1}, whose camera matrices have already been recovered. In this paper, we introduce a novel solution to the six-point online algorithm to recover the exterior parameters associated with I_n. Our algorithm uses just six corresponding pairs of 2D points, each extracted from I_n and from any of the preceding n-1 images, allowing the recovery of the full six degrees of freedom of the n-th camera; unlike common methods, it does not require tracking feature points across three or more images. Our novel solution is based on constructing a Dixon resultant, yielding a solution method that is both efficient and accurate compared to existing solutions. We further use Bernstein's theorem to prove a tight bound on the number of complex solutions. Our experiments demonstrate the utility of our approach.
Citations: 13
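The abstract only outlines the algebraic setup; the following is a hedged sketch of the kind of constraint system involved, with a parametrization that is an assumption rather than the paper's exact formulation. For each of the six correspondences, a point \(x_i\) in a previous image with known camera \(P_{k_i}\) and its match \(x'_i\) in \(I_n\) must be projections of a common 3D point \(X_i\):

\[
x_i \simeq P_{k_i} X_i, \qquad x'_i \simeq P_n X_i, \qquad P_n = K_n\,[R \mid t].
\]

Eliminating each \(X_i\) leaves polynomial constraints in the six unknowns of \((R, t)\) (with the rotation minimally parametrized); the six correspondences then give a square polynomial system, which the paper solves by constructing a Dixon resultant, with Bernstein's theorem bounding the number of complex solutions.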