{"title":"HCTMIF: Hybrid CNN-Transformer Multi Information Fusion Network for Low Light Image Enhancement","authors":"Han Wang, Hengshuai Cui, Jinjiang Li, Zhen Hua","doi":"10.1049/ipr2.70127","DOIUrl":null,"url":null,"abstract":"<p>Images captured with inadequate hardware and insufficient light sources suffer from visual degradations such as low visibility, strong noise, and color casts. Low-light image enhancement methods aim to brighten dark regions while removing these degradations. To address these problems, we propose a hybrid CNN-transformer multi-information fusion network (HCTMIF) for low-light image enhancement. The proposed architecture is divided into three stages that progressively restore the degraded features of low-light images, following a divide-and-conquer strategy. First, the first two stages adopt an encoder–decoder architecture that combines transformer and CNN components to improve the network's long-range modeling and local feature extraction capabilities. We add a visual enhancement module (VEM) to each encoding block to further strengthen the network's ability to learn global and local information. In addition, a multi-information fusion block (MIFB) complements the feature maps at the same scale of the encoding and decoding blocks of each layer. Second, to improve the flow of useful information across stages, we design a self-supervised module (SSM) that readjusts the weight parameters to enhance the representation of local features. Finally, to preserve the spatial details of the enhanced images more precisely, we design a detail supplement unit (DSU) that enriches their saturation. Qualitative and quantitative analyses on multiple benchmark datasets show that our method outperforms other methods in both visual quality and metric scores.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70127","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Image Processing","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70127","RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Images captured with inadequate hardware and insufficient light sources suffer from visual degradations such as low visibility, strong noise, and color casts. Low-light image enhancement methods aim to brighten dark regions while removing these degradations. To address these problems, we propose a hybrid CNN-transformer multi-information fusion network (HCTMIF) for low-light image enhancement. The proposed architecture is divided into three stages that progressively restore the degraded features of low-light images, following a divide-and-conquer strategy. First, the first two stages adopt an encoder–decoder architecture that combines transformer and CNN components to improve the network's long-range modeling and local feature extraction capabilities. We add a visual enhancement module (VEM) to each encoding block to further strengthen the network's ability to learn global and local information. In addition, a multi-information fusion block (MIFB) complements the feature maps at the same scale of the encoding and decoding blocks of each layer. Second, to improve the flow of useful information across stages, we design a self-supervised module (SSM) that readjusts the weight parameters to enhance the representation of local features. Finally, to preserve the spatial details of the enhanced images more precisely, we design a detail supplement unit (DSU) that enriches their saturation. Qualitative and quantitative analyses on multiple benchmark datasets show that our method outperforms other methods in both visual quality and metric scores.
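The core idea of the hybrid design above, a transformer branch for long-range dependency modeling fused with a CNN branch for local feature extraction, can be illustrated with a toy NumPy sketch. This is a hypothetical illustration of the general CNN-transformer fusion pattern, not the authors' HCTMIF implementation: the function names, the moving-average stand-in for convolution, and the additive fusion are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_attention(x):
    # simplified single-head self-attention with identity Q/K/V projections:
    # every token attends to every other token (long-range modeling)
    scores = x @ x.T / np.sqrt(x.shape[1])
    return softmax(scores, axis=-1) @ x

def local_conv(x, k=3):
    # moving average over neighboring tokens, a crude stand-in for the
    # CNN path that captures local structure
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
    return np.stack([xp[i:i + x.shape[0]] for i in range(k)]).mean(axis=0)

def hybrid_block(x):
    # fuse the global (transformer) and local (CNN) branches additively,
    # with a residual connection to the input features
    return x + global_attention(x) + local_conv(x)

feats = np.random.default_rng(0).standard_normal((16, 8))  # 16 tokens, 8 channels
out = hybrid_block(feats)
print(out.shape)  # (16, 8)
```

Real hybrid blocks would add learned projections, normalization, and multi-scale fusion (as HCTMIF's MIFB does across encoder-decoder scales), but the shape-preserving two-branch structure is the essential pattern.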
Journal description:
The IET Image Processing journal encompasses research areas related to the generation, processing and communication of visual information. The journal covers the latest research results in image and video processing, including image generation and display, enhancement and restoration, segmentation, colour and texture analysis, coding and communication, implementations and architectures, as well as innovative applications.
Principal topics include:
Generation and Display - Imaging sensors and acquisition systems, illumination, sampling and scanning, quantization, colour reproduction, image rendering, display and printing systems, evaluation of image quality.
Processing and Analysis - Image enhancement, restoration, segmentation, registration, multispectral, colour and texture processing, multiresolution processing and wavelets, morphological operations, stereoscopic and 3-D processing, motion detection and estimation, video and image sequence processing.
Implementations and Architectures - Image and video processing hardware and software, design and construction, architectures and software, neural, adaptive, and fuzzy processing.
Coding and Transmission - Image and video compression and coding, compression standards, noise modelling, visual information networks, streamed video.
Retrieval and Multimedia - Storage of images and video, database design, image retrieval, video annotation and editing, mixed media incorporating visual information, multimedia systems and applications, image and video watermarking, steganography.
Applications - Innovative application of image and video processing technologies to any field, including life sciences, earth sciences, astronomy, document processing and security.
Current Special Issue Call for Papers:
Evolutionary Computation for Image Processing - https://digital-library.theiet.org/files/IET_IPR_CFP_EC.pdf
AI-Powered 3D Vision - https://digital-library.theiet.org/files/IET_IPR_CFP_AIPV.pdf
Multidisciplinary advancement of Imaging Technologies: From Medical Diagnostics and Genomics to Cognitive Machine Vision, and Artificial Intelligence - https://digital-library.theiet.org/files/IET_IPR_CFP_IST.pdf
Deep Learning for 3D Reconstruction - https://digital-library.theiet.org/files/IET_IPR_CFP_DLR.pdf