Mask-Q attention network for flare removal

IF 5.5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neurocomputing | Pub Date: 2025-03-28 | DOI: 10.1016/j.neucom.2025.130100

Zihao Li, Junming Feng, Siyao Hao, Yuze Wang, Weibang Bai

Neurocomputing, Volume 637, Article 130100. Full text: https://www.sciencedirect.com/science/article/pii/S0925231225007726

Citations: 0

Abstract

Lens flare is a common optical phenomenon that is usually undesirable because it significantly degrades image quality and thereby impairs certain visual tasks. Existing flare-removal methods, whether CNN-based or transformer-based, perform poorly on large-scale flares because they lack the inductive bias for spatial equivariance and cannot capture global and local features simultaneously. As a result, these networks cannot quickly pinpoint flare locations or fully learn the contextual content needed to repair contaminated areas, and they also struggle to restore uncontaminated regions. In this paper, we introduce the Mask-Q Attention Network (MQANet), a multi-scale framework for flare removal. Our approach extracts both global and local features, using ResBlock for local feature extraction and Mask-Q Attention for global features. In Mask-Q Attention, we strengthen the localization capability of the attention mechanism by integrating the flare mask with the query vector (Q) in the self-attention process. The flare mask is obtained by binarizing the initial image. Integrating the flare mask addresses the lack of spatial equivariance in Transformer blocks by giving the network prior knowledge of the flare locations. Extensive experiments demonstrate MQANet’s superiority: it achieves minimum gains of 0.5% in PSNR and 0.6% in SSIM on synthetic datasets, and a 0.1% SSIM improvement on real-world data. It also performs well on the LPIPS metric.
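
The abstract describes two steps that a short sketch can make concrete: binarizing the image into a flare mask, and combining that mask with the query (Q) inside self-attention. The abstract does not say how the mask is thresholded or how it is fused with Q, so everything specific below is an assumption for illustration only: the luminance threshold of 0.9, the additive fusion through a hypothetical projection `w_m`, the single attention head, and all function names. It should not be read as the authors' implementation.

```python
import numpy as np

def flare_mask(image, threshold=0.9):
    """Binarize an RGB image in [0, 1] into a {0, 1} mask of bright pixels.

    The luminance weights and the threshold are assumptions, not taken
    from the paper.
    """
    luminance = image @ np.array([0.299, 0.587, 0.114])
    return (luminance > threshold).astype(np.float32)

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mask_q_attention(tokens, mask, w_q, w_k, w_v, w_m):
    """Single-head self-attention whose query is shifted by a mask embedding.

    tokens: (N, d) pixel or patch features, mask: (N,) values in {0, 1},
    w_q, w_k, w_v: (d, d) projections, w_m: (1, d) hypothetical mask projection.
    """
    # Assumed fusion: add a projected mask term to the query only.
    q = tokens @ w_q + mask[:, None] @ w_m
    k = tokens @ w_k
    v = tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
h, w, d = 8, 8, 16
image = rng.random((h, w, 3)).astype(np.float32)
image[2:4, 2:4] = 1.0  # saturated patch standing in for a flare
mask = flare_mask(image)  # shape (8, 8)

tokens = rng.standard_normal((h * w, d)).astype(np.float32)
w_q, w_k, w_v = (0.1 * rng.standard_normal((d, d)).astype(np.float32) for _ in range(3))
w_m = 0.1 * rng.standard_normal((1, d)).astype(np.float32)
out = mask_q_attention(tokens, mask.reshape(-1), w_q, w_k, w_v, w_m)  # shape (64, 16)
```

Because the mask term enters only through Q, queries from flare pixels and from clean pixels are offset differently before the dot product with K, which is one plausible way to give attention a positional prior about where the flare is. Other fusions (concatenation, multiplicative gating) would fit the abstract's wording equally well.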

Source journal

Neurocomputing (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore: 13.10
Self-citation rate: 10.00%
Publication volume: 1382
Review time: 70 days

Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.