Title: Specular highlight removal using Quaternion transformer
Authors: The Van Le, Jin Young Lee
Journal: Computer Vision and Image Understanding, volume 249, Article 104179
Published: 2024-09-19
DOI: 10.1016/j.cviu.2024.104179
URL: https://www.sciencedirect.com/science/article/pii/S1077314224002601
Impact factor: 4.3 (JCR Q2, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Specular highlight removal is an important problem, because specular reflections in images with illumination changes can degrade a wide range of computer vision and image processing tasks. Many state-of-the-art specular removal networks are built on convolutional neural networks (CNNs), which cannot learn global context effectively. They capture spatial information but overlook the 3D structural correlation information of an RGB image. To address this problem, we introduce a specular highlight removal network based on a Quaternion transformer (QformerSHR), which employs a transformer architecture built on the Quaternion representation. In particular, a depth-wise separable Quaternion convolutional layer (DSQConv) is proposed to enhance the computational performance of QformerSHR while efficiently preserving the structural correlation of an RGB image through the Quaternion representation. In addition, a Quaternion transformer block (QTB) based on DSQConv learns global context. As a result, QformerSHR, consisting of DSQConv and QTB, effectively removes specular highlights from natural and text image datasets. Experimental results demonstrate that it is significantly more effective than state-of-the-art specular removal networks in terms of both quantitative performance and subjective quality.
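As a rough illustration of the Quaternion representation the abstract refers to, the sketch below encodes an RGB pixel as a pure quaternion (zero real part, with R, G, B as the three imaginary parts) and applies the Hamilton product, the standard multiplication rule for quaternions. This is not the authors' DSQConv or QTB; the function names `rgb_to_quaternion` and `hamilton_product` and the sample values are assumptions made here for illustration only.

```python
from typing import Tuple

# A quaternion stored as (real, i, j, k).
Quaternion = Tuple[float, float, float, float]


def rgb_to_quaternion(r: float, g: float, b: float) -> Quaternion:
    """Encode an RGB pixel as a pure quaternion: 0 + r*i + g*j + b*k."""
    return (0.0, r, g, b)


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Multiply two quaternions (non-commutative: p*q != q*p in general)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k part
    )


if __name__ == "__main__":
    pixel = rgb_to_quaternion(0.5, 0.25, 0.125)
    # Hypothetical quaternion weight; every output component depends on
    # all three colour channels, which is how the channels stay coupled.
    weight = (0.5, 0.5, 0.5, 0.5)
    print(hamilton_product(weight, pixel))
```

Because a single quaternion weight mixes all components of its input, one multiplication ties the three colour channels together rather than treating them as independent feature maps, which is the general intuition behind quaternion-valued layers.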
About the journal:
The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.
Research Areas Include:
• Theory
• Early vision
• Data structures and representations
• Shape
• Range
• Motion
• Matching and recognition
• Architecture and languages
• Vision systems