Prashant W. Patil , Santosh Nagnath Randive , Sunil Gupta , Santu Rana , Svetha Venkatesh , Subrahmanyam Murala
Title: Unpaired recurrent learning for real-world video de-hazing
Journal: Pattern Recognition, Volume 166, Article 111698
DOI: 10.1016/j.patcog.2025.111698
Published: 2025-04-17 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0031320325003589
Impact Factor: 7.5; JCR: Q1 (Computer Science, Artificial Intelligence); Region: 1 (Computer Science)
Citations: 0
Abstract
Automated outdoor vision-based applications are increasingly in demand in day-to-day life. Bad weather such as haze, rain, and snow may limit the reliability of these applications by degrading overall video quality, so weather-degraded videos need to be pre-processed before they are fed to downstream applications. Researchers generally adopt synthetically generated paired hazy frames for learning the video de-hazing task. Models trained solely on synthetic data may perform poorly on diverse real-world hazy scenarios due to the significant domain gap between synthetic and real-world hazy videos. One possible solution is to improve generalization by training on unpaired data for video de-hazing. Some unpaired learning approaches have been proposed for single-image de-hazing; however, these approaches compromise temporal consistency, which is important for video de-hazing. With this motivation, we propose a lightweight and temporally consistent architecture for video de-hazing. To achieve this, diverse receptive-field and multi-scale features at various input resolutions are mixed and aggregated with multi-kernel attention to extract significant haze information. Furthermore, we propose a recurrent multi-attentive feature alignment concept that maintains temporal consistency through recurrent feedback of previously restored frames. Comprehensive experiments are conducted on real-world and synthetic video databases (REVIDE and RSA100Haze). Both qualitative and quantitative results show significant improvement of the proposed network, with better temporal consistency than state-of-the-art methods for detailed video restoration in hazy weather. Source code is available at: https://github.com/pwp1208/UnpairedVideoDehazing.
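The two core ideas in the abstract — aggregating multi-receptive-field features with attention weights, and recurrently feeding the previously restored frame back into the next step — can be illustrated with a toy sketch. This is NOT the authors' implementation (which uses learned convolutions and attentive feature alignment; see the linked repository); here box filters of different kernel sizes stand in for the multi-kernel branches, a softmax over branch responses stands in for the attention, and simple averaging with the previous output stands in for recurrent feature alignment. All function names and the fusion scheme are hypothetical simplifications.

```python
import numpy as np

def box_filter(x, k):
    """k x k box filter with reflect padding -- a stand-in for one
    learned convolutional branch with receptive field k."""
    p = k // 2
    xp = np.pad(x, p, mode="reflect")
    out = np.zeros_like(x)
    h, w = x.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = xp[i:i + k, j:j + k].mean()
    return out

def multi_kernel_attention(x, kernels=(3, 5, 7)):
    """Mix branches of different receptive fields with softmax attention
    weights derived from each branch's global response (toy version of
    the paper's multi-kernel attention aggregation)."""
    branches = [box_filter(x, k) for k in kernels]
    scores = np.array([b.mean() for b in branches])
    w = np.exp(scores - scores.max())
    w /= w.sum()                      # weights sum to 1 over branches
    return sum(wi * b for wi, b in zip(w, branches))

def dehaze_video(frames):
    """Recurrent restoration: each input frame is fused with the
    previously restored frame before aggregation, mimicking the
    recurrent feedback that enforces temporal consistency."""
    restored, prev = [], None
    for x in frames:
        inp = x if prev is None else 0.5 * (x + prev)  # naive temporal fusion
        y = multi_kernel_attention(inp)
        restored.append(y)
        prev = y                      # feed the restored frame back
    return restored
```

In the actual network, the box filters would be learned multi-scale convolutions, the scalar attention would be a multi-kernel attention module, and the naive averaging would be replaced by the proposed recurrent multi-attentive feature alignment.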
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.