Mutual Retinex: Combining Transformer and CNN for Image Enhancement
Kui Jiang; Qiong Wang; Zhaoyi An; Zheng Wang; Cong Zhang; Chia-Wen Lin
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 3, pp. 2240-2252, 2024. DOI: 10.1109/TETCI.2024.3369321
Abstract
Images captured in low-light or underwater environments often suffer from significant degradation, which can negatively impact image quality and the performance of downstream tasks. While convolutional neural networks (CNNs) and Transformer architectures have made significant progress in computer vision tasks, few efforts have harmonized them into a concise framework for enhancing such images. To this end, this study proposes to aggregate the individual capabilities of self-attention (SA) and CNNs for accurate perturbation removal while preserving background content. Building on this, we put forward a Retinex-based framework, dubbed Mutual Retinex, in which a two-branch structure is designed to characterize the specific knowledge of the reflectance and illumination components while removing the perturbation. To maximize its potential, Mutual Retinex is equipped with a new mutual learning mechanism built around an elaborately designed mutual representation module (MRM). In the MRM, the complementary information between the reflectance and illumination components is encoded and used to refine each component. Through this complementary learning via the mutual representation, the enhanced results generated by our model exhibit superior color consistency and naturalness. Extensive experiments show the significant superiority of our mutual learning based method over thirteen competitors on the low-light enhancement task and ten methods on the underwater image enhancement task. In particular, the proposed Mutual Retinex surpasses the state-of-the-art method MIRNet-v2 by 0.90 dB and 2.46 dB in PSNR on the LOL 1000 and FIVEK datasets, respectively, while using only 19.8% of its model parameters.
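For context, the classical Retinex model factors an observed image I into an element-wise product of a reflectance component R and an illumination component L (I = R * L). The sketch below illustrates, at feature level, the cross-refinement idea behind a mutual representation module: each branch is gated by information computed from the other branch. It is a minimal, hypothetical sketch, not the authors' implementation; the class name, channel sizes, and the residual sigmoid-gating form are assumptions made for illustration.

```python
# A minimal sketch of the cross-refinement idea behind the mutual representation
# module (MRM). This is NOT the paper's implementation: module names, channel
# sizes, and the gating form below are assumptions for illustration only.
import torch
import torch.nn as nn


class MutualRefinementSketch(nn.Module):
    """Refines reflectance and illumination features with gates computed from each other."""

    def __init__(self, channels: int = 32):
        super().__init__()
        # Gate applied to the reflectance branch, computed from illumination features.
        self.gate_for_refl = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.Sigmoid()
        )
        # Gate applied to the illumination branch, computed from reflectance features.
        self.gate_for_illu = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.Sigmoid()
        )

    def forward(self, refl_feat: torch.Tensor, illu_feat: torch.Tensor):
        # Each branch is modulated by complementary information from its counterpart,
        # with a residual connection so the original features are preserved.
        refl_refined = refl_feat + refl_feat * self.gate_for_refl(illu_feat)
        illu_refined = illu_feat + illu_feat * self.gate_for_illu(refl_feat)
        return refl_refined, illu_refined


if __name__ == "__main__":
    # Classical Retinex observation model: I = R * L (element-wise), where R is
    # reflectance and L is illumination. Here we only exercise the feature-level sketch.
    refl = torch.rand(1, 32, 64, 64)  # stand-in reflectance features
    illu = torch.rand(1, 32, 64, 64)  # stand-in illumination features
    out_r, out_l = MutualRefinementSketch(32)(refl, illu)
    print(out_r.shape, out_l.shape)  # torch.Size([1, 32, 64, 64]) twice
```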
Journal Description
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication and publishes six issues per year.
Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. Illustrative examples include glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.