UPT-Flow:用于低照度图像增强的多尺度变压器引导归一化流程

IF 7.5 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Lintao Xu , Changhui Hu , Yin Hu , Xiaoyuan Jing , Ziyun Cai , Xiaobo Lu
{"title":"UPT-Flow:用于低照度图像增强的多尺度变压器引导归一化流程","authors":"Lintao Xu ,&nbsp;Changhui Hu ,&nbsp;Yin Hu ,&nbsp;Xiaoyuan Jing ,&nbsp;Ziyun Cai ,&nbsp;Xiaobo Lu","doi":"10.1016/j.patcog.2024.111076","DOIUrl":null,"url":null,"abstract":"<div><div>Low-light images often suffer from information loss and RGB value degradation due to extremely low or nonuniform lighting conditions. Many existing methods primarily focus on optimizing the appearance distance between the enhanced image and the normal-light image, while neglecting the explicit modeling of information loss regions or incorrect information points in low-light images. To address this, this paper proposes an Unbalanced Points-guided multi-scale Transformer-based conditional normalizing Flow (UPT-Flow) for low-light image enhancement. We design an unbalanced point map prior based on the differences in the proportion of RGB values for each pixel in the image, which is used to modify traditional self-attention and mitigate the negative effects of areas with information distortion in the attention calculation. The Multi-Scale Transformer (MSFormer) is composed of several global-local transformer blocks, which encode rich global contextual information and local fine-grained details for conditional normalizing flow. In the invertible network of flow, we design cross-coupling conditional affine layers based on channel and spatial attention, enhancing the expressive power of a single flow step. Without bells and whistles, extensive experiments on low-light image enhancement, night traffic monitoring enhancement, low-light object detection, and nighttime image segmentation have demonstrated that our proposed method achieves state-of-the-art performance across a variety of real-world scenes. The code and pre-trained models will be available at <span><span>https://github.com/NJUPT-IPR-XuLintao/UPT-Flow</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"158 ","pages":"Article 111076"},"PeriodicalIF":7.5000,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"UPT-Flow: Multi-scale transformer-guided normalizing flow for low-light image enhancement\",\"authors\":\"Lintao Xu ,&nbsp;Changhui Hu ,&nbsp;Yin Hu ,&nbsp;Xiaoyuan Jing ,&nbsp;Ziyun Cai ,&nbsp;Xiaobo Lu\",\"doi\":\"10.1016/j.patcog.2024.111076\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Low-light images often suffer from information loss and RGB value degradation due to extremely low or nonuniform lighting conditions. Many existing methods primarily focus on optimizing the appearance distance between the enhanced image and the normal-light image, while neglecting the explicit modeling of information loss regions or incorrect information points in low-light images. To address this, this paper proposes an Unbalanced Points-guided multi-scale Transformer-based conditional normalizing Flow (UPT-Flow) for low-light image enhancement. We design an unbalanced point map prior based on the differences in the proportion of RGB values for each pixel in the image, which is used to modify traditional self-attention and mitigate the negative effects of areas with information distortion in the attention calculation. The Multi-Scale Transformer (MSFormer) is composed of several global-local transformer blocks, which encode rich global contextual information and local fine-grained details for conditional normalizing flow. In the invertible network of flow, we design cross-coupling conditional affine layers based on channel and spatial attention, enhancing the expressive power of a single flow step. Without bells and whistles, extensive experiments on low-light image enhancement, night traffic monitoring enhancement, low-light object detection, and nighttime image segmentation have demonstrated that our proposed method achieves state-of-the-art performance across a variety of real-world scenes. The code and pre-trained models will be available at <span><span>https://github.com/NJUPT-IPR-XuLintao/UPT-Flow</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":49713,\"journal\":{\"name\":\"Pattern Recognition\",\"volume\":\"158 \",\"pages\":\"Article 111076\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-10-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Pattern Recognition\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0031320324008276\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pattern Recognition","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0031320324008276","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

由于光照条件极低或不均匀,低照度图像通常会出现信息丢失和 RGB 值下降的问题。现有的许多方法主要侧重于优化增强图像与正常光照图像之间的外观距离,而忽视了对低照度图像中信息丢失区域或不正确信息点的明确建模。针对这一问题,本文提出了一种用于弱光图像增强的非平衡点引导的基于多尺度变换器的条件归一化流程(UPT-Flow)。我们根据图像中每个像素的 RGB 值比例差异设计了一种非平衡点图先验,用来修正传统的自我注意力,减轻信息失真区域在注意力计算中的负面影响。多尺度变换器(MSFormer)由多个全局-局部变换器块组成,编码丰富的全局上下文信息和局部细粒度细节,用于条件归一化流量。在流动的可逆网络中,我们设计了基于通道和空间注意力的交叉耦合条件仿射层,从而增强了单一流动步骤的表现力。在没有任何附加功能的情况下,我们在弱光图像增强、夜间交通监控增强、弱光物体检测和夜间图像分割等方面进行的大量实验证明,我们提出的方法在各种真实场景中都能达到最先进的性能。代码和预训练模型将发布在 https://github.com/NJUPT-IPR-XuLintao/UPT-Flow 网站上。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
UPT-Flow: Multi-scale transformer-guided normalizing flow for low-light image enhancement
Low-light images often suffer from information loss and RGB value degradation due to extremely low or nonuniform lighting conditions. Many existing methods primarily focus on optimizing the appearance distance between the enhanced image and the normal-light image, while neglecting the explicit modeling of information loss regions or incorrect information points in low-light images. To address this, this paper proposes an Unbalanced Points-guided multi-scale Transformer-based conditional normalizing Flow (UPT-Flow) for low-light image enhancement. We design an unbalanced point map prior based on the differences in the proportion of RGB values for each pixel in the image, which is used to modify traditional self-attention and mitigate the negative effects of areas with information distortion in the attention calculation. The Multi-Scale Transformer (MSFormer) is composed of several global-local transformer blocks, which encode rich global contextual information and local fine-grained details for conditional normalizing flow. In the invertible network of flow, we design cross-coupling conditional affine layers based on channel and spatial attention, enhancing the expressive power of a single flow step. Without bells and whistles, extensive experiments on low-light image enhancement, night traffic monitoring enhancement, low-light object detection, and nighttime image segmentation have demonstrated that our proposed method achieves state-of-the-art performance across a variety of real-world scenes. The code and pre-trained models will be available at https://github.com/NJUPT-IPR-XuLintao/UPT-Flow.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Pattern Recognition
Pattern Recognition 工程技术-工程:电子与电气
CiteScore
14.40
自引率
16.20%
发文量
683
审稿时长
5.6 months
期刊介绍: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信