Leveraging Content and Context Cues for Low-Light Image Enhancement

IF 9.7 · Zone 1, Computer Science · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Multimedia Pub Date : 2025-02-21 DOI:10.1109/TMM.2025.3543047

Igor Morawski;Kai He;Shusil Dangi;Winston H. Hsu

Volume 27, pages 5337-5351. Full text: https://ieeexplore.ieee.org/document/10897879/

Citations: 0

Abstract

Low-light conditions have an adverse impact on machine cognition, limiting the performance of computer vision systems in real life. Since low-light data is limited and difficult to annotate, we focus on image processing to enhance low-light images and improve the performance of any downstream task model, instead of fine-tuning each model, which can be prohibitively expensive. We propose to improve existing zero-reference low-light enhancement by leveraging the CLIP model both to capture an image prior and to provide semantic guidance. Specifically, we propose a data augmentation strategy, based on image sampling, that learns an image prior via prompt learning without any need for paired or unpaired normal-light data. Next, we propose a semantic guidance strategy that takes maximal advantage of existing low-light annotations by introducing both content and context cues about the image training patches. We show experimentally, in a qualitative study, that the proposed prior and semantic guidance improve overall image contrast and hue as well as background-foreground discrimination, reducing the over-saturation and noise over-amplification common in related zero-reference methods. Because we target machine cognition, rather than assume a correlation between human perception and downstream task performance, we present an ablation study and a comparison with related zero-reference methods in terms of task-based performance across many low-light datasets, covering image classification, object detection, and face detection, which shows the effectiveness of the proposed method.
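To make the idea of a prompt-based image prior more concrete, the following is a minimal sketch of one common way such guidance can be scored: an image embedding is compared against a "positive" and a "negative" prompt embedding, and the loss is low when the image sits closer to the positive prompt. This is an illustrative assumption, not the authors' implementation. The function names, the two-prompt setup, the temperature value, and the toy 2-D vectors are all hypothetical; in practice the embeddings would come from CLIP's image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def prompt_prior_loss(image_emb, pos_prompt_emb, neg_prompt_emb, temperature=0.07):
    """Cross-entropy over two prompts, with the positive prompt as the target.

    Hypothetical helper: the embeddings stand in for encoder outputs, and
    the temperature is an assumed scaling factor for the similarities.
    """
    s_pos = cosine(image_emb, pos_prompt_emb) / temperature
    s_neg = cosine(image_emb, neg_prompt_emb) / temperature
    # Numerically stable log-sum-exp over the two scaled similarities.
    m = max(s_pos, s_neg)
    log_z = m + math.log(math.exp(s_pos - m) + math.exp(s_neg - m))
    # Negative log-probability assigned to the positive prompt.
    return log_z - s_pos


if __name__ == "__main__":
    # Toy 2-D embeddings (assumed values, for illustration only).
    well_lit_prompt = [1.0, 0.0]
    low_light_prompt = [0.0, 1.0]
    print(prompt_prior_loss([0.9, 0.1], well_lit_prompt, low_light_prompt))
    print(prompt_prior_loss([0.1, 0.9], well_lit_prompt, low_light_prompt))
```

In this sketch, an enhanced image whose embedding aligns with the positive prompt yields a loss near zero, while one aligned with the negative prompt yields a large loss, which is the kind of signal that could steer an enhancement model during training.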

Source journal

IEEE Transactions on Multimedia (Engineering & Technology: Telecommunications)

CiteScore: 11.70
Self-citation rate: 11.00%
Articles published: 576
Review time: 5.5 months

About the journal: The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.