{"title":"Graph Grouping Loss for Metric Learning of Face Image Representations","authors":"Nakamasa Inoue","doi":"10.1109/VCIP49819.2020.9301861","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301861","url":null,"abstract":"This paper proposes Graph Grouping (GG) loss for metric learning and its application to face verification. GG loss predisposes image embeddings of the same identity to be close to each other, and those of different identities to be far from each other by constructing and optimizing graphs representing the relation between images. Further, to reduce the computational cost, we propose an efficient way to compute GG loss for cases where embeddings are L2 normalized. In experiments, we demonstrate the effectiveness of the proposed method for face verification on the VoxCeleb dataset. The results show that the proposed GG loss outperforms conventional losses for metric learning.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128142040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Near Infrared Colorization with Semantic Segmentation and Transfer Learning","authors":"Fengqiao Wang, Lu Liu, Cheolkon Jung","doi":"10.1109/VCIP49819.2020.9301788","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301788","url":null,"abstract":"Although near infrared (NIR) images contain no color, they have abundant and clear textures. In this paper, we propose deep NIR colorization with semantic segmentation and transfer learning. NIR images are capable of capturing invisible spectrum (700-1000 nm) that is quite different from visible spectrum images. We employ convolutional layers to build relationship between single NIR images and three-channel color images, instead of mapping to Lab or YCbCr color space. Moreover, we use semantic segmentation as global prior information to refine colorization of smooth regions for objects. We use color divergence loss to further optimize NIR colorization results with good structures and edges. Since the training dataset is not enough to capture rich color information, we adopt transfer learning to get color and semantic information. Experimental results verify that the proposed method produces a natural color image from single NIR image and outperforms state-of-the-art methods in terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM).","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114570793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning for Photometric Redshift Estimation of Quasars with Different Samples","authors":"Yanxia Zhang, Xin Jin, Jingyi Zhang, Yongheng Zhao","doi":"10.1109/VCIP49819.2020.9301849","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301849","url":null,"abstract":"We compare the performance of Support Vector Machine, XGBoost, LightGBM, k-Nearest Neighbors, Random forests and Extra-Trees on the photometric redshift estimation of quasars based on the SDSS_WISE sample. For this sample, LightGBM shows its superiority in speed while k-Nearest Neighbors, Random forests and Extra-Trees show better performance. Then k-Nearest Neighbors, Random forests and Extra-Trees are applied on the SDSS, SDSS_WISE, SDSS_UKIDSS, WISE_UKIDSS and SDSS_WISE_UKIDSS samples. The results show that the performance of an algorithm depends on the sample selection, sample size, input pattern and information from different bands; for the same sample, the more information the better performance is obtained, but different algorithms shows different accuracy; no single algorithm shows its superiority on every sample.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126570149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ERP-Based CTU Splitting Early Termination for Intra Prediction of 360 videos","authors":"Bernardo Beling, Iago Storch, L. Agostini, B. Zatt, S. Bampi, D. Palomino","doi":"10.1109/VCIP49819.2020.9301879","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301879","url":null,"abstract":"This work presents an Equirectangular projection (ERP) based Coding Tree Unit (CTU) splitting early termination algorithm for the High Efficiency Video Coding (HEVC) intra prediction of 360-degree videos. The proposed algorithm adaptively employs early termination in the HEVC CTU splitting based on distortion properties of the ERP projection, that generate homogeneous regions at the top and bottom portion of a video frame. Experimental results show an average of 24% time saving with 0.11% coding efficiency loss, significantly reducing the encoding complexity with minor impacts in the encoding efficiency. Besides, solution presents the best results considering the relation between time saving and coding efficiency when compared with all related works.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127806240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stereoscopic image reflection removal based on Wasserstein Generative Adversarial Network","authors":"Xiuyuan Wang, Yikun Pan, D. Lun","doi":"10.1109/VCIP49819.2020.9301892","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301892","url":null,"abstract":"Reflection removal is a long-standing problem in computer vision. In this paper, we consider the reflection removal problem for stereoscopic images. By exploiting the depth information of stereoscopic images, a new background edge estimation algorithm based on the Wasserstein Generative Adversarial Network (WGAN) is proposed to distinguish the edges of the background image from the reflection. The background edges are then used to reconstruct the background image. We compare the proposed approach with the state-of-the- art reflection removal methods. Results show that the proposed approach can outperform the traditional single-image based methods and is comparable to the multiple-image based approach while having a much simpler imaging hardware requirement.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129436681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning-Based Nonlinear Transform for HEVC Intra Coding","authors":"Kun-Min Yang, Dong Liu, Feng Wu","doi":"10.1109/VCIP49819.2020.9301790","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301790","url":null,"abstract":"In the hybrid video coding framework, transform is adopted to exploit the dependency within the input signal. In this paper, we propose a deep learning-based nonlinear transform for intra coding. Specifically, we incorporate the directional information into the residual domain. Then, a convolutional neural network model is designed to achieve better decorrelation and energy compaction than the conventional discrete cosine transform. This work has two main contributions. First, we propose to use the intra prediction signal to reduce the directionality in the residual. Second, we present a novel loss function to characterize the efficiency of the transform during the training. To evaluate the compression performance of the proposed transform, we implement it into the High Efficiency Video Coding reference software. Experimental results demonstrate that the proposed method achieves up to 1.79% BD-rate reduction for natural videos.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132902787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CNN-Based Anomaly Detection For Face Presentation Attack Detection With Multi-Channel Images","authors":"Yuge Zhang, Min Zhao, Longbin Yan, Tiande Gao, Jie Chen","doi":"10.1109/VCIP49819.2020.9301818","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301818","url":null,"abstract":"Recently, face recognition systems have received significant attention, and there have been many works focused on presentation attacks (PAs). However, the generalization capacity of PAs is still challenging in real scenarios, as the attack samples in the training database may not cover all possible PAs. In this paper, we propose to perform the face presentation attack detection (PAD) with multi-channel images using the convolutional neural network based anomaly detection. Multi-channel images endow us with rich information to distinguish between different mode of attacks, and the anomaly detection based technique ensures the generalization performance. We evaluate the performance of our methods using the wide multi-channel presentation attack (WMCA) dataset.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134235938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From Low to Super Resolution and Beyond","authors":"C. Kok, Wing-Shan Tam","doi":"10.1109/VCIP49819.2020.9301878","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301878","url":null,"abstract":"The tutorial starts with an introduction of digital image interpolation, and single image super-resolution. It continues with the definition of various image interpolation performance measurement indices, including both objective and subjective indices. The core of this tutorial is the application of covariance based interpolation to achieve high visual quality image interpolation and single image super-resolution results. Layer on layer, the covariance based edge-directed image interpolation techniques that makes use of stochastic image model without explicit edge map, to iterative covariance correction based image interpolation. The edge based interpolation incorporated human visual system to achieve visually pleasant high resolution interpolation results. On each layer, the pros and cons of each image model and interpolation technique, solutions to alleviate the interpolation visual artifacts of each techniques, and innovative modification to overcome limitations of traditional edge-directed image interpolation techniques are presented in this tutorial, which includes: spatial adaptive pixel intensity estimation, pixel intensity correction, error propagation mitigation, covariance windows adaptation, and iterative covariance correction. The tutorial will extend from theoretical and analytical discussions to detail implementation using MATLAB. The audience shall be able to bring home with implementation details, as well as the performance and complexity of the interpolation algorithms discussed in this tutorial.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131867500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning the Connectivity: Situational Graph Convolution Network for Facial Expression Recognition","authors":"Jinzhao Zhou, Xingming Zhang, Yang Liu","doi":"10.1109/VCIP49819.2020.9301773","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301773","url":null,"abstract":"Previous studies recognizing expressions with facial graph topology mostly use a fixed facial graph structure established by the physical dependencies among facial landmarks. However, the static graph structure inherently lacks flexibility in non-standardized scenarios. This paper proposes a dynamic-graph-based method for effective and robust facial expression recognition. To capture action-specific dependencies among facial components, we introduce a link inference structure, called the Situational Link Generation Module (SLGM). We further propose the Situational Graph Convolution Network (SGCN) to automatically detect and recognize facial expression in various conditions. Experimental evaluations on two lab-constrained datasets, CK+ and Oulu, along with an in-the-wild dataset, AFEW, show the superior performance of the proposed method. Additional experiments on occluded facial images further demonstrate the robustness of our strategy.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122345591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast compressed sensing recovery using generative models and sparse deviations modeling","authors":"Lei Cai, Yuli Fu, Youjun Xiang, Tao Zhu, Xianfeng Li, Huanqiang Zeng","doi":"10.1109/VCIP49819.2020.9301808","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301808","url":null,"abstract":"This paper develops an algorithm to effectively explore the advantages of both sparse vector recovery methods and generative model-based recovery methods for solving compressed sensing recovery problem. The proposed algorithm mainly consists of two steps. In the first step, a network-based projected gradient descent (NPGD) is introduced to solve a non-convex optimization problem, obtaining a preliminary recovery of the original signal. Then with the obtained preliminary recovery, a l1 norm regularized optimization problem is solved by optimizing for sparse deviation vectors. Experimental results on two bench-mark datasets for image compressed sensing clearly demonstrate that the proposed recovery algorithm can bring about high computation speed, while decreasing the reconstruction error continuously with increasing the number of measurements.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127496626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}