MFCT: Multi-Frequency Cascade Transformers for no-reference SR-IQA

IF 4.3 · CAS Zone 3 (Computer Science) · JCR Q2: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computer Vision and Image Understanding Pub Date : 2024-08-08 DOI:10.1016/j.cviu.2024.104104


Citations: 0

Abstract

Super-resolution image reconstruction techniques have advanced quickly, and a large number of super-resolution images are now generated by a variety of methods. Nevertheless, accurately assessing the quality of these images remains a formidable challenge. This paper introduces Multi-Frequency Cascade Transformers (MFCT), a novel model for no-reference super-resolution image quality assessment (SR-IQA). First, a Frequency-Divided Module (FDM) transforms each super-resolution image into three frequency bands. Next, Cascade Transformer Blocks (CAF) with hierarchical self-attention mechanisms capture cross-window features for quality perception. Finally, the quality scores from the different frequency bands are fused into an overall image quality score. Experimental results show that, on the chosen SR-IQA databases, the proposed MFCT-based method consistently outperforms all compared Image Quality Assessment (IQA) models. Furthermore, a set of thorough ablation studies demonstrates that the proposed approach exhibits impressive generalization ability compared with earlier methods. The code will be available at https://github.com/kbzhang0505/MFCT.
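The abstract does not say how the FDM separates frequencies or how the per-band scores are fused, so the following is only a minimal sketch of the general idea under stated assumptions: it splits a grayscale image into low, mid, and high bands using differences of box blurs, and fuses per-band scores with a weighted average. The function names, blur radii, and fusion weights are hypothetical and are not taken from the paper; the Transformer scoring stage is omitted entirely.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values


def box_blur(img: Image, radius: int) -> Image:
    """Mean filter over a (2*radius+1)^2 window; borders use edge replication."""
    h, w = len(img), len(img[0])
    area = (2 * radius + 1) ** 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-radius, radius + 1):
                yy = min(max(y + dy, 0), h - 1)  # clamp row index
                for dx in range(-radius, radius + 1):
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    total += img[yy][xx]
            out[y][x] = total / area
    return out


def subtract(a: Image, b: Image) -> Image:
    """Pixel-wise difference a - b."""
    return [[pa - pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def split_three_bands(
    img: Image, small_radius: int = 1, large_radius: int = 3
) -> Tuple[Image, Image, Image]:
    """Return (low, mid, high) bands that sum back to the input image.

    Assumption: a difference-of-blurs split stands in for the paper's FDM.
    """
    blur_small = box_blur(img, small_radius)
    blur_large = box_blur(img, large_radius)
    low = blur_large                        # coarse structure
    mid = subtract(blur_small, blur_large)  # medium-scale detail
    high = subtract(img, blur_small)        # fine detail and edges
    return low, mid, high


def fuse_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-band quality scores (hypothetical fusion rule)."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight
```

In this sketch the three bands add up to the original image, so no information is lost by the split; each band would then be scored by its own quality model before `fuse_scores` combines the results.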

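Source journal

Computer Vision and Image Understanding (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 7.80
Self-citation rate: 4.40%
Articles published: 112
Review time: 79 days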

Journal description: The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views. Research areas include: theory; early vision; data structures and representations; shape; range; motion; matching and recognition; architecture and languages; vision systems.
