Lightweight Color Image Demosaicking with Multi-Core Feature Extraction
Yufei Tan, Kan Chang, Hengxin Li, Zhenhua Tang, Tuanfa Qin
In: 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 2020. DOI: 10.1109/VCIP49819.2020.9301841 (https://doi.org/10.1109/VCIP49819.2020.9301841)
Abstract: Convolutional neural network (CNN)-based color image demosaicking methods have achieved great success recently. However, in many applications where computational resources are highly limited, it is not practical to deploy large-scale networks. This paper proposes a lightweight CNN for color image demosaicking. Firstly, to effectively extract shallow features, a multi-core feature extraction module, which takes the Bayer sampling positions into consideration, is proposed. Secondly, by taking advantage of inter-channel correlation, an attention-aware fusion module is presented to efficiently reconstruct the full color image. Moreover, a feature enhancement module, which contains several cascading attention-aware enhancement blocks, is designed to further refine the initial reconstructed image. To demonstrate the effectiveness of the proposed network, several state-of-the-art demosaicking methods are compared. Experimental results show that, with the smallest number of parameters, the proposed network outperforms the other compared methods in terms of both objective and subjective quality.
A Unified Single Image De-raining Model via Region Adaptive Coupled Network
Q. Wu, Li Chen, K. Ngan, Hongliang Li, Fanman Meng, Linfeng Xu
In: 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 2020. DOI: 10.1109/VCIP49819.2020.9301865 (https://doi.org/10.1109/VCIP49819.2020.9301865)
Abstract: Single image de-raining is quite challenging due to the diversity of rain types and the inhomogeneous distribution of rainwater. By means of dedicated models and constraints, existing methods perform well for specific rain types, but their generalization capability is highly limited. In this paper, we propose a unified de-raining model that selectively fuses the clean background of the input rain image with well-restored regions that were occluded by various rains. This is achieved by our region adaptive coupled network (RACN), whose two branches integrate each other's features at different layers to jointly generate a spatial-variant weight map and a restored image, respectively. On the one hand, the weight branch can lead the restoration branch to focus on the regions that contribute most to de-raining. On the other hand, the restoration branch can guide the weight branch away from regions at risk of over- or under-filtering. Extensive experiments show that our method outperforms many state-of-the-art de-raining algorithms on diverse rain types, including rain streaks, raindrops and rain mist.
Spatiotemporal Guided Self-Supervised Depth Completion from LiDAR and Monocular Camera
Z. Chen, Hantao Wang, Lijun Wu, Yanlin Zhou, Dapeng Oliver Wu
In: 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 2020. DOI: 10.1109/VCIP49819.2020.9301857 (https://doi.org/10.1109/VCIP49819.2020.9301857)
Abstract: Depth completion aims to estimate dense depth maps from sparse depth measurements. It has become increasingly important in autonomous driving and has therefore drawn wide attention. In this paper, we introduce photometric losses in both the spatial and temporal domains to jointly guide self-supervised depth completion. The method performs accurate end-to-end depth completion using LiDAR and a monocular camera. In particular, we fully utilize the consistent information in temporally adjacent frames and in the stereo views to improve the accuracy of depth completion during model training. We design a self-supervised framework to eliminate the negative effects of moving objects and regions with smooth gradients. Experiments are conducted on KITTI. The results indicate that our self-supervised method attains competitive performance.
DEN: Disentanglement and Enhancement Networks for Low Illumination Images
Nelson Chong Ngee Bow, Vu-Hoang Tran, Punchok Kerdsiri, Y. P. Loh, Ching-Chun Huang
In: 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 2020. DOI: 10.1109/VCIP49819.2020.9301830 (https://doi.org/10.1109/VCIP49819.2020.9301830)
Abstract: Although learning-based low-light enhancement methods have achieved significant success, existing methods are still sensitive to noise and unnatural appearance. These problems may come from a lack of structural awareness and confusion between noise and texture. We therefore present a low-light image enhancement method that consists of an image disentanglement network and an illumination boosting network. The disentanglement network is first used to decompose the input image into image details and image illumination. The extracted illumination part then goes through a multi-branch enhancement network designed to improve the dynamic range of the image. The multi-branch network extracts multi-level image features and enhances them via numerous subnets. These enhanced features are then fused to generate the enhanced illumination part. Finally, the denoised image details and the enhanced illumination are entangled to produce the normal-light image. Experimental results show that our method produces visually pleasing images on many public datasets.
HDR Image Compression with Convolutional Autoencoder
Fei Han, Jin Wang, Ruiqin Xiong, Qing Zhu, Baocai Yin
In: 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 2020. DOI: 10.1109/VCIP49819.2020.9301853 (https://doi.org/10.1109/VCIP49819.2020.9301853)
Abstract: As one of the next-generation multimedia technologies, high dynamic range (HDR) imaging has been widely applied. Due to its wider color range, an HDR image imposes a greater compression and storage burden than a traditional low dynamic range (LDR) image. To solve this problem, this paper proposes a two-layer HDR image compression framework based on convolutional neural networks. The framework is composed of a base layer, which provides backward compatibility with standard JPEG, and an extension layer based on a convolutional variational autoencoder and a post-processing module. The autoencoder mainly includes a nonlinear transform encoder, a binarized quantizer and a nonlinear transform decoder. Compared with traditional codecs, the proposed CNN autoencoder is more flexible and retains more image semantic information, which improves the quality of the decoded HDR image. Moreover, to reduce the compression artifacts and noise of the reconstructed HDR image, a post-processing method based on group convolutional neural networks is designed. Experimental results show that our method outperforms JPEG XT profiles A, B and C and other methods in terms of the HDR-VDP-2 evaluation metric, while remaining backward compatible with standard JPEG.
{"title":"GRNet: Deep Convolutional Neural Networks based on Graph Reasoning for Semantic Segmentation","authors":"Yang Wu, A. Jiang, Yibin Tang, H. Kwan","doi":"10.1109/VCIP49819.2020.9301851","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301851","url":null,"abstract":"In this paper, we develop a novel deep-network architecture for semantic segmentation. In contrast to previous work that widely uses dilated convolutions, we employ the original ResNet as the backbone, and a multi-scale feature fusion module (MFFM) is introduced to extract long-range contextual information and upsample feature maps. Then, a graph reasoning module (GRM) based on graph-convolutional network (GCN) is developed to aggregate semantic information. Our graph reasoning network (GRNet) extracts global contexts of input features by modeling graph reasoning in a single framework. Experimental results demonstrate that our approach provides substantial benefits over a strong baseline and achieves superior segmentation performance on two benchmark datasets.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114771623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Random-access-aware Light Field Video Coding using Tree Pruning Method","authors":"T. N. Huu, V. V. Duong, B. Jeon","doi":"10.1109/VCIP49819.2020.9301800","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301800","url":null,"abstract":"The increasing prevalence of VR/AR as well as the expected availability of Light Field (LF) display soon call for more practical methods to transmit LF image/video for services. In that aspect, the LF video coding should not only consider the compression efficiency but also the view random-access capability (especially in the multi-view-based system). The multi-view coding system heavily exploits view dependencies coming from both inter-view and temporal correlation. While such a system greatly improves the compression efficiency, its view random-access capability can be much reduced due to so called \"chain of dependencies.\" In this paper, we first model the chain of dependencies by a tree, then a cost function is used to assign an importance value to each tree node. By travelling from top to bottom, a node of lesser importance is cut-off, forming a pruned tree to achieve reduction of random-access complexity. Our tree pruning method has shown to reduce about 40% of random-access complexity at the cost of minor compression loss compared to the state-of-the-art methods. Furthermore, it is expected that our method is very lightweight in its realization and also effective on a practical LF video coding system.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117140852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On 2D-3D Image Feature Detections for Image-To-Geometry Registration in Virtual Dental Model
Hui-jun Tang, R. T. Hsung, W. Y. Lam, Leo Y. Y. Cheng, E. Pow
In: 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 2020. DOI: 10.1109/VCIP49819.2020.9301774 (https://doi.org/10.1109/VCIP49819.2020.9301774)
Abstract: 3D digital smile design (DSD) has gained great interest in dentistry because it enables esthetic design of teeth and gum. However, the color texture of teeth and gum is often lost or distorted in the digitization process. Recently, the image-to-geometry registration shade mapping (IGRSM) method was proposed for registering color texture from 2D photography onto a 3D mesh model. It allows better control of illumination and color calibration for automatic teeth shade matching. In this paper, we investigate automated techniques to find the correspondences between a 3D tooth model and color intraoral photographs in order to accurately perform the IGRSM. We propose to use the tooth cusp tips as correspondence points for the IGR because they can be reliably detected in both 2D photography and 3D surface scans. A modified gradient descent method with directional priority (GDDP) and region growing are developed to find the 3D correspondence points. For the 2D image, tooth-tip contour lines are extracted based on luminosity and chromaticity, and the contour peaks are then detected as the correspondence points. Experimental results show that the proposed method achieves excellent accuracy in detecting the correspondence points between 2D photography and the 3D tooth model. The average registration error is less than 15 pixels for a 4752×3168 intraoral image.
{"title":"A semantic labeling framework for ALS point clouds based on discretization and CNN","authors":"Xingtao Wang, Xiaopeng Fan, Debin Zhao","doi":"10.1109/VCIP49819.2020.9301759","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301759","url":null,"abstract":"The airborne laser scanning (ALS) point cloud has drawn increasing attention thanks to its capability to quickly acquire large-scale and high-precision ground information. Due to the complexity of observed scenes and the irregularity of point distribution, the semantic labeling of ALS point clouds is extremely challenging. In this paper, we introduce an efficient discretization based framework according to the geometric character of ALS point clouds, and propose an original intraclass weighted cross entropy loss function to solve the problem of data imbalance. We evaluate our framework on the ISPRS (International Society for Photogrammetry and Remote Sensing) 3D Semantic Labeling dataset. The experimental results show that the proposed method has achieved a new state-of-the-art performance in terms of overall accuracy (85.3%) and average F1 score (74.1%).","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127038386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Orthogonal Features Fusion Network for Anomaly Detection
Teli Ma, Yizhi Wang, Jinxin Shao, Baochang Zhang, D. Doermann
In: 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 2020. DOI: 10.1109/VCIP49819.2020.9301755 (https://doi.org/10.1109/VCIP49819.2020.9301755)
Abstract: Generative models have been successfully used for anomaly detection; however, they need a large number of parameters and high computation overhead, especially when training spatial and temporal networks in the same framework. In this paper, we introduce a novel network architecture, the Orthogonal Features Fusion Network (OFF-Net), to solve the anomaly detection problem. We show that the convolutional feature maps used for generating future frames are orthogonal to each other, which can improve the representation capacity of generative models and strengthen temporal connections between adjacent images. We introduce a simple but effective module, easily mounted on convolutional neural networks (CNNs) with negligible additional parameters, which can replace the widely used optical flow network and significantly improve the performance of anomaly detection. Extensive experimental results demonstrate the effectiveness of OFF-Net: we outperform the state-of-the-art model by 1.7% in terms of AUC, and we save around 85M parameters compared with prevailing prior art that uses an optical flow network, without compromising performance.