No-reference image quality assessment based on multi-scale dynamic modulation and degradation information

IF 3.4 · CAS Zone 2 (Engineering & Technology) · Q1 (Computer Science, Hardware & Architecture)
Yongcan Zhao, Yinghao Zhang, Tianfeng Xia, Tianhuan Huang, Xianye Ben, Lei Chen
Citations: 0

Abstract

Image quality assessment is a fundamental problem in image processing, but the complex and varied distortions present in real-world images often hinder accurate quality scoring. To address these issues, this paper presents a novel no-reference image quality assessment method based on multi-scale dynamic modulation and gated fusion (MDM-GFIQA), which jointly captures and fuses degradation and distortion features to predict image quality scores more accurately. Specifically, shallow features are first extracted using a pre-trained feature extractor. To explore deeper perceptual distortion features, we introduce a multi-scale adaptive feature modulation (MsAFM) block into the perceptual network. The MsAFM processes spatial information at different scales in parallel through multiple channels and combines with a multi-branch convolutional block (MBCB), which makes the network sensitive to both local features and global information. A contrastive learning auxiliary branch (CLAB) is constructed via supervised contrastive learning to acquire rich degradation features that guide the distortion features extracted by the perceptual network. The outputs of these two streams are then merged by our proposed dynamic fusion enhancement module (DFEM), which focuses on key distortion information before passing the fused features to a regression network that predicts the final quality score. Extensive experiments on seven publicly available databases demonstrate the superior performance of the proposed model over several state-of-the-art methods, e.g., achieving SRCC values of 0.929 (vs. 0.898) on TID2013 and 0.887 (vs. 0.875) on LIVEC.
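The SRCC values reported above are Spearman rank-order correlation coefficients, which measure monotonic agreement between a model's predicted scores and subjective mean opinion scores (MOS). A minimal pure-Python sketch of the metric, using hypothetical scores rather than the paper's data:

```python
# Minimal sketch of Spearman's rank correlation (SRCC), the IQA metric
# quoted above. All scores below are illustrative, not from the paper.

def average_ranks(xs):
    """Rank values from 1..n, averaging ranks over ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def srcc(predicted, subjective):
    """SRCC = Pearson correlation of the two rank vectors."""
    rp, rs = average_ranks(predicted), average_ranks(subjective)
    n = len(rp)
    mp, ms = sum(rp) / n, sum(rs) / n
    cov = sum((a - mp) * (b - ms) for a, b in zip(rp, rs))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_s = sum((b - ms) ** 2 for b in rs)
    return cov / (var_p * var_s) ** 0.5

# Hypothetical model outputs vs. hypothetical MOS labels.
predicted = [72.1, 45.3, 88.0, 30.2, 60.5]
mos = [70.0, 62.0, 90.0, 28.0, 50.0]
print(f"SRCC = {srcc(predicted, mos):.3f}")  # SRCC = 0.900
```

Because SRCC depends only on rank order, it rewards a model that sorts images correctly by quality even if its absolute score scale differs from the MOS scale; PLCC, by contrast, measures linear agreement.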
Source journal: Displays (Engineering & Technology – Electrical & Electronic Engineering)
CiteScore: 4.60
Self-citation rate: 25.60%
Articles per year: 138
Review time: 92 days
Journal description: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including the display-human interface. Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals, intended for display technology and human-factors engineers new to the field, will also occasionally be featured.