Face Reflection Removal Network Using Multispectral Fusion of RGB and NIR Images
Authors: Hui Lan; Enquan Zhang; Cheolkon Jung
Journal: IEEE Open Journal of Signal Processing, vol. 5, pp. 383-392 (Journal Article)
Publication date: 2024-01-09
DOI: 10.1109/OJSP.2024.3351472
Impact factor: 2.9; JCR: Q2 (Engineering, Electrical & Electronic)
URL: https://ieeexplore.ieee.org/document/10384724/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10384724
Citation count: 0
Abstract
Images captured through glass are often contaminated by reflections, and removing them is a challenging task. Because faces are usually the main subject of a photo, reflections over a face are especially distracting to viewers. In this article, we propose a face reflection removal network, called FRRN, that uses multispectral fusion of color (RGB) and near-infrared (NIR) images. Because visible light [380 nm, 780 nm] and near infrared [780 nm, 2526 nm] occupy different spectral ranges, NIR cameras are not sensitive to visible light, and NIR images are therefore less corrupted by reflections. NIR images preserve structure information well and can guide the removal of reflections from the RGB images. Thus, we adopt multispectral fusion of RGB and NIR images for reflection removal from a face image. FRRN consists of one encoder, the contextual encoder model (CEM), and two decoders, the NIR inference decoder model (NIDM) and the image inference decoder model (IIDM). CEM captures features of the scene information from shallow to deep layers while suppressing the sparse reflection component. NIDM infers the NIR image to provide multi-scale guidance for reflection removal, while IIDM estimates the transmission layer under the guidance of NIDM. In addition, we present a reflection confidence generation module (RCGM), based on Laplacian convolution and a channel attention-based residual block (CARB), to represent the reflection confidence of a region for reflection removal. To train FRRN, we construct a large-scale training dataset of face images and reflection layers (RGB and NIR images), together with a corresponding test dataset, using a JAI AD-130 GE camera. Various experiments demonstrate that FRRN outperforms state-of-the-art methods for reflection removal in terms of visual quality and quantitative measurements.
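
The abstract names Laplacian convolution as one ingredient of the RCGM but does not give the kernel or the way its output is used. The following is a minimal sketch of the general idea only, not the authors' module: it applies a fixed 4-neighbour Laplacian kernel (an assumption) to a grayscale RGB image and to its NIR counterpart, then marks pixels where the RGB image has edge energy that the NIR image lacks. The function names `laplacian_response` and `edge_disagreement`, and the normalization to [0, 1], are hypothetical choices made for illustration.

```python
import numpy as np

# 4-neighbour discrete Laplacian kernel. This is an assumed choice;
# the abstract does not state which kernel the paper uses.
LAPLACIAN_KERNEL = np.array([[0.0, 1.0, 0.0],
                             [1.0, -4.0, 1.0],
                             [0.0, 1.0, 0.0]])


def laplacian_response(image):
    """Absolute Laplacian response of a 2-D grayscale image.

    Borders are handled by replicating edge pixels, so a constant
    image yields a zero response everywhere.
    """
    img = np.asarray(image, dtype=np.float64)
    if img.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate the nine shifted copies weighted by the kernel.
    # The kernel is symmetric, so correlation equals convolution.
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN_KERNEL[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.abs(out)


def edge_disagreement(rgb_gray, nir):
    """Hypothetical confidence-like map in [0, 1].

    High values mark pixels with edge energy in the RGB image that is
    absent from the NIR image, which is where reflections are more
    likely under the abstract's premise that NIR is less affected.
    """
    diff = laplacian_response(rgb_gray) - laplacian_response(nir)
    diff = np.clip(diff, 0.0, None)
    peak = diff.max()
    return diff / peak if peak > 0 else diff


if __name__ == "__main__":
    nir = np.zeros((5, 5))
    rgb = nir.copy()
    rgb[2, 2] = 1.0  # a synthetic "reflection" present only in RGB
    print(edge_disagreement(rgb, nir))
```

In the paper the confidence is produced by a learned module that also includes the CARB; the fixed-kernel comparison above only illustrates why a Laplacian response is a plausible cue for locating reflection edges.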