Title: Image Segmentation Based Graph-Cut Approach to Fast Color Image Coding via Graph Fourier Transform
Authors: Kaito Abiko, Kazunori Uruma, Mamoru Sugawara, S. Hangai, T. Hamamoto
Venue: 2019 IEEE Visual Communications and Image Processing (VCIP), December 2019
DOI: https://doi.org/10.1109/VCIP47243.2019.8966021
Abstract: Colorization-based image coding compresses the chrominance information of an image using a colorization technique. A conventional algorithm applies the graph Fourier transform (GFT) to colorization-based coding: several pixels of the image are defined as vertices of a graph, and the chrominance values of those pixels are set as graph signals. The graph signal is transformed into a graph spectrum via the GFT, and the spectrum is compressed and stored. In the decoding phase, the inverse GFT recovers the graph signal from the stored spectrum, and the color image is reconstructed from the luminance image and the recovered chrominance values. However, the GFT is computationally expensive, so this paper proposes a fast graph Fourier transform to improve the conventional colorization-based coding algorithm. In numerical examples, the proposed algorithm is 16.8 times faster than the conventional method, at the cost of a 0.3 dB decrease in PSNR.
Title: Facial Attention based Convolutional Neural Network for 2D+3D Facial Expression Recognition
Authors: Yang Jiao, Yi Niu, Yuting Zhang, Fu Li, Chunbo Zou, Guangming Shi
Venue: 2019 IEEE Visual Communications and Image Processing (VCIP), December 2019
DOI: https://doi.org/10.1109/VCIP47243.2019.8965843
Abstract: Discriminative facial parts are essential for facial expression recognition (FER) because of the small inter-class differences and large intra-class variations of expression images. Existing methods localize discriminative regions with the aid of extra facial landmarks, such as action units (AUs), but such manual labeling is labor-intensive. To address this problem, we propose a facial attention based convolutional neural network (FA-CNN) for 2D+3D FER. The main contribution of FA-CNN is its facial attention mechanism, which enables the network to localize discriminative regions automatically from multi-modality expression images without dense landmark annotations. Experiments on BU-3DFE demonstrate that FA-CNN achieves state-of-the-art performance compared with existing 2D+3D FER techniques, and the discriminative facial parts estimated by the attention mechanism are highly interpretable and consistent with human perception.
{"title":"Stronger Baseline for Vehicle Re-Identification in the Wild","authors":"Chih-Chung Hsu, Cing-Hao Hung, Chih-Yu Jian, Yi-Xiu Zhuang","doi":"10.1109/VCIP47243.2019.8965867","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965867","url":null,"abstract":"Recently, re-identification tasks in computer vision field draw attention. Vehicle re-identification can be used to find the suspect car (target) from a vast surveillance video dataset. One of the most critical issues in the vehicle re-identification task is how to learn the effective feature representation. In general, pairwise learning such as the contrastive and triplet loss functions is adopted to learn the discriminative feature based on the convolution neural network. A good backbone network will lead to a significant improvement in the car re-identification task. In this paper, a stronger baseline method is proposed to achieve a better feature representation ability. First, we integrate the shift-invariant convolutional neural network with ResNet backbone to enhance the consistency feature learning. Afterward, a multi-layer feature fusion module is proposed to incorporate the middle- and high-level features to further improve the performance of car re-identification. Experimental results demonstrated that the proposed stronger baseline method achieves state-of-the-art performance in terms of mean averaging precision.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127175435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Macropixel-constrained Collocated Position Search for Plenoptic Video Coding","authors":"Lingjun Li, Xin Jin","doi":"10.1109/VCIP47243.2019.8965931","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965931","url":null,"abstract":"The plenoptic video recording light fields varying with time has the large-range motion and complex macropixel structure, which brings a great challenge to efficient compression. Based on the analysis of the relationship between temporal motion and macropixel arrangement, an efficient motion estimation algorithm is proposed for plenoptic video compression. It finetunes motion vector predictor (MVP) by the macropixel-constrained collocated position of the current prediction unit in the nearest macropixel, and uses all the macropixel-constrained collocated positions, derived from the finetuned MVP, as motion search candidates for better complexity and compression-efficiency trade-off. The experimental results demonstrate that the proposed algorithm outperforms multi-view based compression and the pseudovideo based compression by an average of 56.40% and 78.72% bitrate reduction, respectively. Compared with HEVC, the proposed method can also save 7.65% bitrate.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130013188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-view high dynamic range reconstruction via gain estimation","authors":"Firas Abedi, Qiong Liu, You Yang","doi":"10.1109/VCIP47243.2019.8965880","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965880","url":null,"abstract":"Multi-view high dynamic range reconstruction is a challenging problem, especially if the multi-view low dynamic range images are obtained from cameras arranged sparsely with limited shared view of vision among them. In this paper, we address the above challenge in addition to the back-lighting problem. We first enclose the geometry characteristic of the scene to rectify the outlier feature points. Consequently, an exposure gain is calculated according to those rectified features. After that, we extend the dynamic range for the multi-view low dynamic range images based on the estimated gain, then, generate a final high dynamic range image per view. Experimental results demonstrate superior performance for the proposed method over state-of-the-art methods in both objective and subject comparisons. These results suggest that our method is suitable to improve the visual quality of multi-view low dynamic range images captured in low back-lighting conditions via commercial cameras sparsely located among each other.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126736623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Improved Gaussian Mixture Model Based Hole-filling Algorithm Exploiting Depth Information","authors":"Tiantian Zhu, Pan Gao","doi":"10.1109/VCIP47243.2019.8965964","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965964","url":null,"abstract":"Virtual views generation is of great significance in free viewpoint video (FVV) as it can avoid the need to transmit a large volume of video data. An important issue in generating virtual views is how to fill the holes caused by occlusion. Using the Gaussian mixture model (GMM) to generate the background reference image is a commonly used hole-filling method. However, GMM usually has poor performance for sequences with reciprocal motion. In this paper, we propose an improved GMM-based method. To avoid the foreground pixels misclassified as the background pixels, we use depth information to adjust the learning rate in GMM. Foreground pixel is given a smaller learning rate than the background. Further, a refined foreground depth correlation (FDC) algorithm is proposed, which generates the background frame by tracking the change of the foreground depth in the temporal direction. In contrast to existing algorithms, we use a sliding window to obtain multiple background reference frames. These reference frames are then fused together to generate a more accurate background frame. Finally, we adaptively choose the background pixel from the GMM and FDC for hole filling. The experimental results show that subjective gain can be achieved, and significant objective gain can be observed in reciprocal motion sequences.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121409061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wireless Cooperative Caching System","authors":"Chaoyu Gu, Jian Xiong, Haonan Xie, Peng Cheng","doi":"10.1109/VCIP47243.2019.8965684","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965684","url":null,"abstract":"In this paper, we introduce a wireless cooperative caching system (WCCS) to reduce the cost of both operators and users and also improve the quality of experience (QoE) of users. In this system, an intelligent routing relay (IRR) assigns a list of popular services and user terminals (UTs) can cache these services with their own cellular network traffic. When UTs upload these cached services to the IRR, they can obtain rewards. The other users can then access the IRR to get the cached services at a faster rate and get better QoE. The cellular network operators can save spectrum resource while they just deliver one copy of services. In order to encourage users to upload, we use reverse auction and first-come-first-served (FCFS) to choose the winning bids and allocate rewards to UTs. In this demo, we present displays on PC and mobile phones.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121837635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Feature Guided Image Retargeting","authors":"Jinan Wu, Rong Xie, Li Song, Bo Liu","doi":"10.1109/VCIP47243.2019.8966008","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8966008","url":null,"abstract":"Image retargeting is the technique to display images via devices with various aspect ratios and sizes. Traditional content-aware retargeting methods rely on low-level features to predict pixel-wise importance and can hardly preserve both the structure lines and salient regions of the source image. To address this problem, we propose a novel adaptive image warping approach which integrates with deep convolutional neural network. In the proposed method, a visual importance map and a foreground mask map are generated by a pre-trained network. The two maps and other constraints guide the warping process to yield retargeted results with less distortions. Extensive experiments in terms of visual quality and a user study are carried out on the widely used RetargetMe dataset. Experimental results show that our method outperforms current state-of-art image retargeting methods.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133532602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual-Streams Global Guided Learning for High Dynamic Range Image Reconstruction","authors":"Junjie Lian, Yongfang Wang, Chuang Wang","doi":"10.1109/VCIP47243.2019.8965798","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965798","url":null,"abstract":"High dynamic range (HDR) images capture the luminance information of the real world and have more detailed information than low dynamic range (LDR) images. In this paper, we propose a dual-streams global guided end-to-end learning method to reconstruct HDR image from a single LDR input that combines both global information and local image features. In our framework, global features and local features are separately learned in dual-streams branches. In the reconstructed phase, we use a fusion layer to fuse them so that the global features can guide the local features to better reconstruct the HDR image. Furthermore, we design mixed loss function including multi-scale pixel-wise loss, color similarity loss and gradient loss to jointly train our network. Comparative experiments are carried out with other state-of-the-art methods and our method achieves superior performance.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117168127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Frequency-Domain Analysis Based Exploitation Of Color Channels For Color Image Demosaicking
Authors: S. Jaiswal, V. Jakhetiya, K. Gui, Sharath Chandra Guntuku, A. Singla
Venue: 2019 IEEE Visual Communications and Image Processing (VCIP), December 2019
DOI: https://doi.org/10.1109/VCIP47243.2019.8966070
Abstract: Color-difference interpolation (CDI) is a widely used technique in color demosaicking. CDI-based methods interpolate in the color-difference domain, assuming that the color-difference signal is a low-pass signal. More recently, a residual interpolation (RI) algorithm, which interpolates in the residual domain, has been developed under the assumption that the residual domain is flatter and smoother than the channel-difference domain. In this paper, we present a comprehensive frequency-domain analysis of these assumptions and observe that their validity is image-dependent and can create artifacts in the interpolated image. Accordingly, we propose an algorithm that exploits inter-color correlation and the residual smoothness across channels better than existing algorithms. Experimental results show that the proposed algorithm outperforms existing algorithms in terms of both visual and objective quality.