Synergistic dual and efficient additive attention network for No-Reference Image Quality Assessment

Impact factor: 3.5 · Zone 3, Computer Science · JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computer Vision and Image Understanding · Pub Date: 2025-09-27 · DOI: 10.1016/j.cviu.2025.104516

Zhou Fang, Baiming Feng, Ning Li

Volume 261, Article 104516. Not open access. Article page: https://www.sciencedirect.com/science/article/pii/S1077314225002395

Citations: 0

Abstract

No-Reference Image Quality Assessment (NR-IQA) aims to evaluate the perceptual quality of images in alignment with human subjective judgments. However, most existing NR-IQA methods, while striving for high accuracy, often neglect computational complexity. To address this challenge, we propose a Synergistic Spatial and Channel and Efficient Additive Attention Network for NR-IQA. In our approach, we first employ a feature extraction module to derive features rich in both distortion and semantic information. Subsequently, we introduce a spatial-channel synergistic attention mechanism to enhance feature representations across spatial and channel dimensions. This attention module focuses on the most salient regions of the image and modulates feature responses accordingly, enabling the network to emphasize critical distortions and semantic features pertinent to perceptual quality assessment. Specifically, the spatial attention mechanism identifies significant regions that substantially contribute to quality perception, while the channel attention mechanism adjusts the importance of each feature channel, ensuring effective utilization of spatial and channel-specific information. Furthermore, to enhance the model’s robustness, we incorporate an Efficient Additive Attention mechanism alongside a Multi-scale Feed-forward Network, designed to reduce computational costs without compromising performance. Finally, a dual-branch structure for patch-weighted quality prediction is employed to derive the final quality score based on the weighted scores of individual patches. Extensive experimental evaluations on four widely used benchmark datasets demonstrate that the proposed method surpasses several state-of-the-art NR-IQA approaches in both performance and computational efficiency.
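The final step described above, patch-weighted quality prediction, can be illustrated with a minimal sketch. The abstract does not give the pooling formula, so this assumes a normalized weighted average of per-patch scores, a common choice in patch-based IQA models. The function name `patch_weighted_score` and the `eps` parameter are hypothetical and not taken from the paper.

```python
from typing import Sequence


def patch_weighted_score(
    scores: Sequence[float],
    weights: Sequence[float],
    eps: float = 1e-8,
) -> float:
    """Pool per-patch quality scores into one image-level score.

    Assumption (not confirmed by the abstract): the image score is the
    weighted sum of patch scores divided by the sum of the weights.
    In a dual-branch head, one branch would supply `scores` and the
    other would supply the non-negative `weights`.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if len(scores) == 0:
        raise ValueError("at least one patch is required")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    weighted_sum = sum(s * w for s, w in zip(scores, weights))
    total_weight = sum(weights)
    # eps guards against division by zero when all weights are zero.
    return weighted_sum / (total_weight + eps)


if __name__ == "__main__":
    # Two patches: the second is weighted three times as heavily.
    print(patch_weighted_score([0.2, 0.8], [1.0, 3.0]))
```

Under this assumption, patches that the weight branch considers more informative pull the image score toward their own predicted quality, while equal weights reduce the result to a plain mean.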




Source journal

Computer Vision and Image Understanding (Engineering and Technology: Engineering, Electrical and Electronic)

CiteScore

7.80

Self-citation rate

4.40%

Articles published

112

Review time

79 days

About the journal: The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views. Research Areas Include: • Theory • Early vision • Data structures and representations • Shape • Range • Motion • Matching and recognition • Architecture and languages • Vision systems