{"title":"Learning Photometric Stereo via Manifold-based Mapping","authors":"Yakun Ju, Muwei Jian, Junyu Dong, K. Lam","doi":"10.1109/VCIP49819.2020.9301860","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301860","url":null,"abstract":"Three-dimensional reconstruction technologies are fundamental problems in computer vision. Photometric stereo recovers the surface normals of a 3D object from varying shading cues, prevailing in its capability for generating fine surface normal. In recent years, deep learning-based photometric stereo methods are capable of improving the surface-normal estimation under general non-Lambertian surfaces, due to its powerful fitting ability on the non-Lambertian surface. These state-of-the-art methods however usually regress the surface normal directly from the high-dimensional features, without exploring the embedded structural information. This results in the underutilization of the information available in the features. Therefore, in this paper, we propose an efficient manifold-based framework for learning-based photometric stereo, which can better map combined high-dimensional feature spaces to low-dimensional manifolds. 
Extensive experiments show that our method, learning with the low-dimensional manifolds, achieves more accurate surface-normal estimation, outperforming other state-of-the-art methods on the challenging DiLiGenT benchmark dataset.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124545585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Power/QoS-Adaptive HEVC FME Hardware using Machine Learning-Based Approximation Control","authors":"Wagner Penny, D. Palomino, M. Porto, B. Zatt","doi":"10.1109/VCIP49819.2020.9301797","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301797","url":null,"abstract":"This paper presents a machine learning-based adaptive approximate hardware design targeting the fractional motion estimation (FME) of HEVC encoder. Hardware designs targeting multiple levels of approximation are proposed, by changing FME filters coefficients and/or discarding taps. The level of approximation is defined by a decision tree, generated taking into account the behavior of several parameters of the encoding in order to predict homogeneous blocks, more suitable for more aggressive approximation without significant losses on quality of service (QoS). Instead of applying a specific level of approximation over the full video, different approximate FME accelerators are dynamically selected. Such a strategy is able to provide up to 50.54% of power reduction while keeping the QoS losses at 1.18% BD-BR.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128998273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sensitivity-Aware Bit Allocation for Intermediate Deep Feature Compression","authors":"Yuzhang Hu, Sifeng Xia, Wenhan Yang, Jiaying Liu","doi":"10.1109/VCIP49819.2020.9301807","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301807","url":null,"abstract":"In this paper, we focus on compressing and transmitting deep intermediate features to support the prosperous applications at the cloud side efficiently, and propose a sensitivity-aware bit allocation algorithm for the deep intermediate feature compression. Considering that different channels’ contributions to the final inference result of the deep learning model might differ a lot, we design a channel-wise bit allocation mechanism to maintain the accuracy while trying to reduce the bit-rate cost. The algorithm consists of two passes. In the first pass, only one channel is exposed to compression degradation while other channels are kept as the original ones in order to test this channel’s sensitivity to the compression degradation. This process will be repeated until all channels’ sensitivity is obtained. Then, in the second pass, bits allocated to each channel will be automatically decided according to the sensitivity obtained in the first pass to make sure that the channel with higher sensitivity can be allocated with more bits to maintain accuracy as much as possible. 
With the well-designed algorithm, our method surpasses state-of-the-art compression tools with on average 6.4% BD-rate saving.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128664469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network Update Compression for Federated Learning","authors":"B. Kathariya, Li Li, Zhu Li, Ling-yu Duan, Shan Liu","doi":"10.1109/VCIP49819.2020.9301815","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301815","url":null,"abstract":"In federated learning setting, models are trained in a variety of edge-devices with locally generated data and each round only updates in the current model rather than the model itself are sent to the server where they are aggregated to compose an improved model. These edge devices, however, reside in highly uneven nature of network with higher latency and lower-throughput connections and are intermittently available for training. In addition, a network connection has an asymmetric nature of downlink and uplink. All these contribute to a major challenge while synchronizing these updates to the server. In this work, we proposed an efficient coding solution to significantly reduce uplink communication cost by reducing the total number of parameters required for updates. This was achieved by applying Gaussian Mixture Model (GMM) to localize Karhunen–Loève Transform (KLT) on inter-model subspace and representing it with two low-rank matrices. 
Experiments on convolutional neural network (CNN) models showed the proposed model can significantly reduce the uplink communication cost in federated learning while preserving reasonable accuracy.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115530634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Color Transform in VVC Standard","authors":"Hong-Jheng Jhu, Xiaoyu Xiu, Yi-Wen Chen, Tsung-Chuan Ma, Xianglin Wang","doi":"10.1109/VCIP49819.2020.9301798","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301798","url":null,"abstract":"This paper provides an in-depth overview of the adaptive color transform (ACT) tool that is adopted into the emerging versatile video coding (VVC) standard. With the ACT, prediction residuals in the original color space are adaptively converted into another color space to reduce the correlation among the three color components of video sequences in 4:4:4 chroma format. The residuals after color space conversion are then transformed, quantized and entropy-coded, following the VVC framework. YCgCo-R transforms, which can be easily implemented with shift and addition operations, are selected as the ACT core transforms to do the color space conversion. Additionally, to facilitate its implementations, the ACT is disabled in certain cases where the three color components do not share the same block partition, e.g. under separate tree partition mode or intra sub-partition prediction mode. Simulation results based on the VVC reference software show that ACT may provide significant coding gains with negligible impact on encoding and decoding runtime.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116040429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"No-Reference Stereoscopic Image Quality Assessment Based on Convolutional Neural Network with A Long-Term Feature Fusion","authors":"Sumei Li, Mingyi Wang","doi":"10.1109/VCIP49819.2020.9301854","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301854","url":null,"abstract":"With the rapid development of three-dimensional (3D) technology, the effective stereoscopic image quality assessment (SIQA) methods are in great demand. Stereoscopic image contains depth information, making it much more challenging in exploring a reliable SIQA model that fits human visual system. In this paper, a no-reference SIQA method is proposed, which better simulates binocular fusion and binocular rivalry. The proposed method applies convolutional neural network to build a dual-channel model and achieve a long-term process of feature extraction, fusion, and processing. What’s more, both high and low frequency information are used effectively. Experimental results demonstrate that the proposed model outperforms the state-of-the-art no-reference SIQA methods and has a promising generalization ability.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127543419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Discrete Cosine Model of Light Field Sampling for Improving Rendering Quality of Views","authors":"Ying Wei, Changjian Zhu, You Yang, Yan Liu","doi":"10.1109/VCIP49819.2020.9301838","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301838","url":null,"abstract":"A number of theories have been proposed for reducing sampling rate of light field. But these theories still need a great many of samples (images) to obtain sufficient geometric information. In this paper, we utilize the sparse representation of light field in Discrete Cosine Transform domain to present a Discrete Cosine Sparse Basis (DCSB). Thus, we can find out the zeros of DCSB to reduce sampling requirement of light field for alias-free rendering. Finally, experimental results demonstrate the effectiveness of our approach without lose information.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122007579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Marked Point Process Model For Visual Perceptual Groups Extraction","authors":"A. Mbarki, M. Naouai","doi":"10.1109/VCIP49819.2020.9301776","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301776","url":null,"abstract":"Perceptual organization is the process of assigning each part of a scene to a specified association of features to be a part of the same organization. In the twenty century, Gestalt psychologists formalized how image features tend to be grouped by giving a set of organizing principles. In this paper, we propose an approach for the detection of perceptual groups in an image. We are mainly interested in features grouped by the proximity law of Gestalt. We conceive an object-based model within a stochastic framework using a marked point process (MPP). We use a Bayesian learning method to extract perceptual groups in a scene. The proposed model tested on synthetic images proves the efficient detection of perceptual groups in noisy images.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"155 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126593524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two recent advances on normalization methods for deep neural network optimization","authors":"Lei Zhang","doi":"10.1109/VCIP49819.2020.9301751","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301751","url":null,"abstract":"The normalization methods are very important for the effective and efficient optimization of deep neural networks (DNNs). The statistics such as mean and variance can be used to normalize the network activations or weights to make the training process more stable. Among the activation normalization techniques, batch normalization (BN) is the most popular one. However, BN has poor performance when the batch size is small in training. We found that the formulation of BN in the inference stage is problematic, and consequently presented a corrected one. Without any change in the training stage, the corrected BN significantly improves the inference performance when training with small batch size.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125151846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Matching Behavior Differences for Compressing Vehicle Re-identification Models","authors":"Yi Xie, Jianqing Zhu, Huanqiang Zeng, C. Cai, Lixin Zheng","doi":"10.1109/VCIP49819.2020.9301869","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301869","url":null,"abstract":"Vehicle re-identification matching vehicles captured by different cameras has great potential in the field of public security. However, recent vehicle re-identification approaches exploit complex networks, causing large computations in their testing phases. In this paper, we propose a matching behavior difference learning (MBDL) method to compress vehicle re-identification models for saving testing computations. In order to represent the matching behavior evolution across two different layers of a deep network, a matching behavior difference (MBD) matrix is designed. Then, our MBDL method minimizes the L1 loss function among MBD matrixes from a small student network and a complex teacher network, ensuring the student network use less computations to simulate the teacher network’s matching behaviors. During the testing phase, only the small student network is utilized so that testing computations can be significantly reduced. 
Experiments on VeRi776 and VehicleID datasets show that MBDL outperforms many state-of-the-art approaches in terms of accuracy and testing time performance.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124279302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}