2019 IEEE Winter Conference on Applications of Computer Vision (WACV): Latest Publications

Ancient Painting to Natural Image: A New Solution for Painting Processing
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-02 | DOI: 10.1109/WACV.2019.00061
Tingting Qiao, Weijing Zhang, Miao Zhang, Zixuan Ma, Duanqing Xu
Abstract: Collecting a large-scale and well-annotated dataset for image processing has become a common practice in computer vision. In the ancient painting domain, however, this is not practical, as the number of paintings is limited and their styles are greatly diverse. We therefore propose a novel solution to the problems that come with ancient painting processing: use domain transfer to convert ancient paintings into photo-realistic natural images. By doing so, "ancient painting processing problems" become "natural image processing problems", and models trained on natural images can be applied directly to the transferred paintings. Specifically, we focus on Chinese ancient flower, bird, and landscape paintings in this work. A novel Domain Style Transfer Network (DSTN) is proposed to transfer ancient paintings to natural images; it employs a compound loss to ensure that the transferred paintings maintain the color composition and content of the input paintings. The experimental results show that the transferred paintings generated by the DSTN perform better in both the human perceptual test and other image processing tasks than those of other state-of-the-art methods, indicating the authenticity of the transferred paintings and the superiority of the proposed method.
Citations: 13
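
To make the mechanism concrete, here is a minimal sketch of a compound loss in the spirit of the one described: an adversarial term plus content- and color-preservation terms. The grayscale content proxy, the pooled color term, and the weights are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def compound_loss(fake_natural, painting, disc_logits_fake,
                  lambda_content=10.0, lambda_color=5.0):
    # Adversarial term: push generated images toward the "real natural image" label.
    adv = F.binary_cross_entropy_with_logits(
        disc_logits_fake, torch.ones_like(disc_logits_fake))
    # Content term (assumed proxy): preserve structure via grayscale intensities.
    content = F.l1_loss(fake_natural.mean(dim=1, keepdim=True),
                        painting.mean(dim=1, keepdim=True))
    # Color-composition term (assumed proxy): match heavily downsampled,
    # low-frequency color layouts.
    color = F.l1_loss(F.avg_pool2d(fake_natural, 16), F.avg_pool2d(painting, 16))
    return adv + lambda_content * content + lambda_color * color

painting = torch.rand(2, 3, 64, 64)                          # batch of ancient paintings
fake_natural = torch.rand(2, 3, 64, 64, requires_grad=True)  # generator output
disc_logits = torch.randn(2, 1)                              # discriminator scores
print(compound_loss(fake_natural, painting, disc_logits))
```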
TextCaps: Handwritten Character Recognition With Very Small Datasets
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00033
Vinoj Jayasundara, S. Jayasekara, Hirunima Jayasekara, Jathushan Rajasegaran, Suranga Seneviratne, R. Rodrigo
Abstract: Many localized languages struggle to reap the benefits of recent advancements in character recognition systems due to the lack of a substantial amount of labeled training data. This stems from the difficulty of generating large amounts of labeled data for such languages and the inability of deep learning techniques to learn properly from a small number of training samples. We solve this problem with a technique for generating new training samples from existing ones, with realistic augmentations that reflect actual variations present in human handwriting, by adding random controlled noise to their corresponding instantiation parameters. With a mere 200 training samples per class, our results surpass existing character recognition results on the EMNIST-letter dataset and match existing results on three other datasets: EMNIST-balanced, EMNIST-digits, and MNIST. We also develop a strategy for effectively combining loss functions to improve reconstructions. Our system is useful for character recognition in localized languages that lack much labeled training data, and even in related, more general contexts such as object recognition.
Citations: 48
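
The augmentation technique lends itself to a short sketch: perturb a class capsule's instantiation parameters with small controlled noise and decode new samples. The 16-dimensional capsule vector follows the usual CapsNet convention, and the decoder below is an untrained stand-in for the learned reconstruction network.

```python
import torch
import torch.nn as nn

decoder = nn.Sequential(           # toy reconstruction network (assumed architecture)
    nn.Linear(16, 512), nn.ReLU(),
    nn.Linear(512, 28 * 28), nn.Sigmoid())

def augment(inst_params, noise_scale=0.2, n_new=5):
    # inst_params: instantiation vector of the correct class capsule, shape (16,).
    samples = []
    for _ in range(n_new):
        # Uniform noise in [-noise_scale, noise_scale]: "random controlled noise".
        noise = (torch.rand_like(inst_params) - 0.5) * 2 * noise_scale
        samples.append(decoder(inst_params + noise).view(28, 28))
    return torch.stack(samples)

inst = torch.randn(16)             # pretend capsule output for one character
new_images = augment(inst)
print(new_images.shape)            # torch.Size([5, 28, 28])
```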
High-Speed Video from Asynchronous Camera Array
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00237
Si Lu
Abstract: This paper presents a method for capturing high-speed video using an asynchronous camera array. Our method sequentially fires each sensor in a camera array with a small time offset and assembles the captured frames into a high-speed video according to their time stamps. The resulting video, however, suffers from parallax jittering caused by the viewpoint differences among the sensors in the array. To address this problem, we develop a dedicated novel-view synthesis algorithm that transforms the video frames as if they had been captured by a single reference sensor. Specifically, for any frame from a non-reference sensor, we find the two temporally neighboring frames captured by the reference sensor. Using these three frames, we render a new frame with the same time stamp as the non-reference frame but from the viewpoint of the reference sensor: we segment the frames into super-pixels, apply local content-preserving warping to form the new frame, and employ a multi-label Markov Random Field method to blend the warped frames. Our experiments show that our method can produce high-quality, high-speed video for a wide variety of scenes with large parallax, scene dynamics, and camera motion, and that it outperforms several baseline and state-of-the-art approaches.
Citations: 2
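
The frame-interleaving step is easy to illustrate: stagger N sensors evenly within one frame period and merge their frames by timestamp. This toy sketch (with hypothetical frame labels) omits the paper's novel-view synthesis stage that removes the parallax jittering.

```python
# Four 30 fps sensors fired with even offsets yield an effective 120 fps stream.
n_cams, base_fps, n_frames = 4, 30, 3
offset = 1.0 / (base_fps * n_cams)      # stagger sensors evenly within one period

frames = []
for cam in range(n_cams):
    for k in range(n_frames):
        t = cam * offset + k / base_fps  # firing time of this sensor's k-th frame
        frames.append((t, f"cam{cam}_frame{k}"))

high_speed = sorted(frames)              # assemble by time stamp
for t, name in high_speed:
    print(f"{t:.4f}s  {name}")
```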
Attentive Conditional Channel-Recurrent Autoencoding for Attribute-Conditioned Face Synthesis
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00168
Wenling Shang, Kihyuk Sohn
Abstract: Attribute-conditioned face synthesis has many potential use cases, such as aiding the identification of a suspect or a missing person. Building on a conditional version of VAE-GAN, we augment the pathways connecting the latent space with a channel-recurrent architecture, in order to provide not only improved generation quality but also interpretable high-level features. In particular, to better achieve the latter, we further propose an attention mechanism over each attribute to indicate the specific latent subset responsible for its modulation. Thanks to the latent semantics formed via the channel-recurrency, we envision a tool that takes the desired attributes as inputs and then performs a two-stage, general-to-specific generation of diverse and realistic faces. Lastly, we incorporate the progressive-growth training scheme into the inference, generation, and discriminator networks of our models to facilitate higher-resolution outputs. Evaluations are performed through both qualitative visual examination and quantitative metrics, namely inception scores, human preferences, and attribute classification accuracy.
Citations: 1
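
A rough sketch of per-attribute attention over the latent space: each attribute learns a soft mask that selects the latent subset it modulates. The softmax-mask form, the dimensions, and the noise-based modulation below are assumptions for illustration only.

```python
import torch
import torch.nn as nn

n_attrs, latent_dim = 40, 128                         # assumed sizes
attn_logits = nn.Parameter(torch.zeros(n_attrs, latent_dim))

def modulate(z, attr_idx, strength=1.0):
    # Soft mask over latent units: which subset is responsible for this attribute.
    mask = torch.softmax(attn_logits[attr_idx], dim=-1)
    # Perturb only the selected subset; untouched units keep their semantics.
    return z + strength * mask * torch.randn_like(z)

z = torch.randn(1, latent_dim)
z_new = modulate(z, attr_idx=31)   # hypothetical "smiling" attribute index
print(z_new.shape)                 # torch.Size([1, 128])
```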
VelocityGAN: Subsurface Velocity Image Estimation Using Conditional Adversarial Networks
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00080
Zhongping Zhang, Yue Wu, Zheng Zhou, Youzuo Lin
Abstract: Acoustic- and elastic-waveform inversion is an important and widely used method for reconstructing subsurface velocity images. Waveform inversion is a typical non-linear and ill-posed inverse problem. Existing physics-driven computational methods for waveform inversion suffer from cycle-skipping and local-minima issues, and are computationally expensive besides. In this paper, we developed a real-time data-driven technique, VelocityGAN, to accurately reconstruct subsurface velocities. VelocityGAN is an end-to-end framework that can generate high-quality velocity images directly from raw seismic waveform data. A series of experiments on synthetic seismic reflection data evaluates the effectiveness and efficiency of VelocityGAN. We compare it not only with existing physics-driven approaches but also with several deep learning frameworks as data-driven baselines. The experimental results show that VelocityGAN outperforms the physics-driven waveform inversion methods and achieves state-of-the-art performance among the data-driven baselines.
Citations: 13
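
A minimal sketch of a conditional-adversarial objective of the kind the abstract suggests: an adversarial term plus a pixelwise data-fidelity term on the predicted velocity image. The L1 reconstruction term and its weight are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def generator_loss(pred_velocity, true_velocity, disc_logits_fake, lam=100.0):
    # Adversarial term: make predicted velocity maps look real to the critic.
    adv = F.binary_cross_entropy_with_logits(
        disc_logits_fake, torch.ones_like(disc_logits_fake))
    # Data-fidelity term: stay close to the ground-truth velocity model.
    rec = F.l1_loss(pred_velocity, true_velocity)
    return adv + lam * rec

pred = torch.rand(2, 1, 64, 64, requires_grad=True)  # generator output
true = torch.rand(2, 1, 64, 64)                      # ground-truth velocity images
print(generator_loss(pred, true, torch.randn(2, 1)))
```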
Taylor Convolutional Networks for Image Classification
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00140
Xiaodi Wang, Ce Li, Yipeng Mou, Baochang Zhang, J. Han, Jianzhuang Liu
Abstract: This paper provides a new perspective for understanding CNNs based on the Taylor expansion, leading to new Taylor Convolutional Networks (TaylorNets) for image classification. We introduce a principled combination of high-frequency information (i.e., detailed information) and low-frequency information in the end-to-end TaylorNets, based on a nonlinear combination of the convolutional feature maps. The steerable module developed in TaylorNets is generic: it can easily be integrated into well-known deep architectures and learned within the same back-propagation pipeline, yielding a higher representation capacity for CNNs. Extensive experimental results demonstrate the capability of our TaylorNets, which improve widely used CNN architectures, such as conventional CNNs and ResNet, in terms of object classification accuracy on well-known benchmarks. The code will be publicly available.
Citations: 1
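
A sketch of a Taylor-style module: combine a convolution's feature maps with their elementwise higher-order powers, mimicking a truncated Taylor expansion. The order-2 truncation and the scalar per-order coefficients are assumptions about one plausible instantiation, not the paper's exact module.

```python
import torch
import torch.nn as nn

class TaylorBlock(nn.Module):
    def __init__(self, in_ch, out_ch, order=2):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        # One learnable coefficient per expansion order.
        self.coeffs = nn.Parameter(torch.ones(order))

    def forward(self, x):
        f = self.conv(x)
        out = torch.zeros_like(f)
        for k, c in enumerate(self.coeffs, start=1):
            out = out + c * f.pow(k)   # k=1: low-frequency term; k>1: detail terms
        return out

x = torch.randn(1, 3, 32, 32)
print(TaylorBlock(3, 8)(x).shape)      # torch.Size([1, 8, 32, 32])
```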
On Measuring the Iconicity of a Face
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00231
Prithviraj Dhar, C. Castillo, R. Chellappa
Abstract: For a given identity in a face dataset, certain iconic images are more representative of the subject than others. In this paper, we explore the problem of computing the iconicity of a face. The premise of the proposed approach is as follows: for an identity containing a mixture of iconic and non-iconic images, if a given face cannot be successfully matched with any other face of the same identity, then the iconicity of that face image is low. Using this information, we train a Siamese multi-layer perceptron network such that each of its twins predicts an iconicity score for one of the image features fed in as input. We observe how the obtained scores vary with covariates such as blur, yaw, pitch, roll, and occlusion to demonstrate that they effectively predict image quality, and we compare them with other existing metrics. Furthermore, we use these scores to weight features for template-based face verification and compare this with media averaging of features.
Citations: 9
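
The scoring idea reduces to a small sketch: a shared MLP (the Siamese "twin") maps each face feature of a pair to a score, and training would push the scores to track match success. The 512-dimensional feature and the layer sizes below are assumptions.

```python
import torch
import torch.nn as nn

scorer = nn.Sequential(              # shared twin: face feature -> iconicity score
    nn.Linear(512, 128), nn.ReLU(),
    nn.Linear(128, 1), nn.Sigmoid())

feat_a, feat_b = torch.randn(1, 512), torch.randn(1, 512)
score_a, score_b = scorer(feat_a), scorer(feat_b)   # same weights on both twins
print(score_a.item(), score_b.item())

# Score-weighted template aggregation (one plausible use of the scores):
template = torch.stack([feat_a[0], feat_b[0]])
weights = torch.stack([score_a[0], score_b[0]])
pooled = (weights * template).sum(0) / weights.sum()
print(pooled.shape)                  # torch.Size([512])
```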
Improving 3D Human Pose Estimation Via 3D Part Affinity Fields
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00112
Ding Liu, Zixu Zhao, Xinchao Wang, Yuxiao Hu, Lei Zhang, Thomas Huang
Abstract: 3D human pose estimation from monocular images has recently become an active area of computer vision. For years, most deep-neural-network-based practice has adopted either an end-to-end approach or a two-stage approach. An end-to-end network typically estimates 3D human poses directly from 2D input images, but it suffers from the shortage of 3D human pose data, and it is hard to tell whether its inaccuracy stems from limited visual understanding or from the 2D-to-3D mapping. A two-stage approach, by contrast, uses an existing network for 2D keypoint detection and then lifts the 2D keypoints directly to 3D space, but it tends to ignore useful contextual hints in the raw 2D image pixels. In this paper, we introduce a two-stage architecture that eliminates the main disadvantages of both approaches. In the first stage, we use an existing state-of-the-art detector to estimate 2D poses. To add contextual information that helps lift 2D poses to 3D, we propose 3D Part Affinity Fields (3D-PAFs). We use 3D-PAFs to infer 3D limb vectors and combine them with the 2D poses to regress the 3D coordinates. We trained and tested the proposed framework on Human3.6M, the most popular 3D human pose benchmark dataset. Our approach achieves state-of-the-art performance, which shows that with the right selection of contextual information, a simple regression model can be very powerful for estimating 3D poses.
Citations: 9
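
The second stage admits a compact sketch: concatenate the detected 2D keypoints with 3D limb direction vectors inferred from the 3D-PAFs and regress 3D joints with a small MLP. The 17-joint skeleton matches common Human3.6M practice; the network shape and the single hidden layer are assumptions.

```python
import torch
import torch.nn as nn

n_joints, n_limbs = 17, 16
lifter = nn.Sequential(                  # simple regression model, as the paper argues
    nn.Linear(n_joints * 2 + n_limbs * 3, 1024), nn.ReLU(),
    nn.Linear(1024, n_joints * 3))

pose_2d = torch.randn(1, n_joints * 2)   # stage-1 2D keypoint detections (x, y)
limb_vecs = torch.randn(1, n_limbs * 3)  # 3D limb vectors inferred from 3D-PAFs
pose_3d = lifter(torch.cat([pose_2d, limb_vecs], dim=1)).view(1, n_joints, 3)
print(pose_3d.shape)                     # torch.Size([1, 17, 3])
```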
Euclidean Invariant Recognition of 2D Shapes Using Histograms of Magnitudes of Local Fourier-Mellin Descriptors
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00038
Xinhua Zhang, L. Williams
Abstract: Because the magnitudes of inner products with its basis functions are invariant to rotation and scale change, the Fourier-Mellin transform has long been used as a component in Euclidean invariant 2D shape recognition systems. Yet Fourier-Mellin transform magnitudes are only invariant to rotation and scale changes about a known center point, and full Euclidean invariant shape recognition is not possible except when this center point can be consistently and accurately identified. In this paper, we describe a system where a Fourier-Mellin transform is computed at every point in the image. The spatial support of the Fourier-Mellin basis functions is made local by multiplying them with a polynomial envelope. Significantly, the magnitudes of convolutions with these complex filters at isolated points are not (by themselves) used as features for Euclidean invariant shape recognition, because reliable discrimination would require filters with spatial support large enough to fully encompass the shapes. Instead, we rely on the fact that normalized histograms of magnitudes are fully Euclidean invariant. We demonstrate a system based on the VLAD machine learning method that performs Euclidean invariant recognition of 2D shapes and requires an order of magnitude less training data than comparable methods based on convolutional neural networks.
Citations: 0
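
A sketch of the descriptor pipeline: convolve the image with one polynomial-windowed complex filter, keep the per-pixel response magnitudes, and summarize the image by a normalized histogram of those magnitudes. The single log-polar filter below stands in for the full local Fourier-Mellin bank, and its angular/radial frequencies are arbitrary choices.

```python
import numpy as np
from scipy.signal import convolve2d

def magnitude_histogram(img, bins=32):
    # Build one localized Fourier-Mellin-like basis function on a 15x15 grid.
    y, x = np.mgrid[-7:8, -7:8]
    r = np.hypot(x, y) + 1e-6
    theta = np.arctan2(y, x)
    envelope = np.maximum(0.0, 1 - (r / 7) ** 2)   # polynomial envelope -> local support
    filt = envelope * np.exp(1j * (2 * theta + 3 * np.log(r)))  # assumed (m, k) orders
    # Complex response at every point; magnitudes are locally rotation/scale robust.
    mags = np.abs(convolve2d(img, filt, mode="valid")).ravel()
    # Normalized histogram of magnitudes: the fully Euclidean-invariant summary.
    hist, _ = np.histogram(mags, bins=bins, range=(0, mags.max() + 1e-9))
    return hist / hist.sum()

img = np.random.rand(64, 64)
print(magnitude_histogram(img)[:5])
```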
On the Importance of Feature Aggregation for Face Reconstruction
2019 IEEE Winter Conference on Applications of Computer Vision (WACV) | Pub Date: 2019-01-01 | DOI: 10.1109/WACV.2019.00103
Xiang Xu, Ha A. Le, I. Kakadiaris
Abstract: The goal of this work is to seek principles for designing a deep neural network for 3D face reconstruction from a single image. To keep the evaluation simple, we generated a synthetic dataset and used it for evaluation. We conducted extensive experiments using E2FAR, an end-to-end face reconstruction algorithm, and its variations, and analyzed why it can be applied successfully to 3D face reconstruction. From these comparative studies, we conclude that feature aggregation from different layers is key to training better neural networks for 3D face reconstruction. Based on these observations, we propose a face reconstruction feature aggregation network (FR-FAN), which obtains significant improvements over the baselines on the synthetic validation set. We evaluate our model on existing popular indoor and in-the-wild 2D-3D datasets. Extensive experiments demonstrate that FR-FAN performs 16.50% and 9.54% better than E2FAR on BU-3DFE and JNU-3D, respectively. Finally, a sensitivity analysis performed on controlled datasets demonstrates that the designed network is robust to large variations in pose, illumination, and expression.
Citations: 3
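
The aggregation principle can be sketched briefly: pool features from several encoder depths and fuse them before regressing the face parameters. The tiny backbone, the global-average pooling, and the 62-dimensional output (a common 3DMM-style parameterization) are illustrative assumptions, not FR-FAN's actual architecture.

```python
import torch
import torch.nn as nn

class FeatureAggregator(nn.Module):
    def __init__(self, out_params=62):
        super().__init__()
        self.b1 = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU())
        self.b2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.b3 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.head = nn.Linear(16 + 32 + 64, out_params)

    def forward(self, x):
        f1 = self.b1(x)
        f2 = self.b2(f1)
        f3 = self.b3(f2)
        gap = lambda f: f.mean(dim=(2, 3))   # global average pool each depth
        fused = torch.cat([gap(f1), gap(f2), gap(f3)], dim=1)  # aggregate layers
        return self.head(fused)

print(FeatureAggregator()(torch.randn(1, 3, 128, 128)).shape)  # torch.Size([1, 62])
```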