Title: Multi-Task Guided No-Reference Omnidirectional Image Quality Assessment With Feature Interaction
Authors: Yun Liu; Sifan Li; Huiyu Duan; Yu Zhou; Daoxin Fan; Guangtao Zhai
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 9, pp. 8794-8806 (JCR Q1, Engineering, Electrical & Electronic)
DOI: 10.1109/TCSVT.2025.3551723
Publication date: 2025-03-17 (Journal Article)
URL: https://ieeexplore.ieee.org/document/10929024/
Citations: 0
Abstract
Omnidirectional image quality assessment (OIQA) has become an increasingly important problem in recent years. Most previous no-reference OIQA methods extract only local features from distorted viewports, or only global features from the entire distorted image, lacking interaction and fusion between local and global features. Moreover, the absence of reference information further limits their performance. We therefore propose a no-reference OIQA model consisting of three novel modules: a bidirectional pseudo-reference module, a Mamba-based global feature extraction module, and a multi-scale local-global feature aggregation module. Specifically, by modeling the image distortion degradation process, the bidirectional pseudo-reference module first captures error maps on viewports to refine the multi-scale local visual features, supplying rich quality-degradation reference information without access to a reference image. To complement these local features, a VMamba module extracts representative multi-scale global visual features. Inspired by the hierarchical characteristics of human visual perception, a novel multi-scale aggregation module strengthens feature interaction and fusion to extract deep semantic information. Finally, motivated by the multi-task management mechanism of the human brain, a multi-task learning module assists the main quality assessment task by mining the hidden information in compression type and distortion degree. Extensive experiments demonstrate that the proposed method achieves state-of-the-art performance on the no-reference OIQA task compared with other models.
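The abstract describes two ideas that can be sketched concretely: fusing per-scale local (viewport) and global (VMamba) features, and an auxiliary multi-task objective over compression type and distortion degree. The paper does not publish its aggregation operator or loss formulation, so the pure-Python sketch below is an illustration under assumed choices (per-scale concatenation, a weighted-sum loss with hypothetical weights `w_type` and `w_degree`), not the authors' implementation.

```python
def fuse_local_global(local_feats, global_feats):
    """Toy stand-in for multi-scale local-global aggregation:
    concatenate the local and global feature vectors at each scale.
    (Assumed operator; the paper's module is a learned fusion.)"""
    return [loc + glo for loc, glo in zip(local_feats, global_feats)]


def multi_task_loss(quality_loss, comp_type_loss, degree_loss,
                    w_type=0.1, w_degree=0.1):
    """Weighted sum of the main quality-regression loss and the two
    auxiliary losses (compression type, distortion degree).
    The weights are hypothetical placeholders, not from the paper."""
    return quality_loss + w_type * comp_type_loss + w_degree * degree_loss


if __name__ == "__main__":
    # Two scales, each with a 2-D local and 2-D global feature vector.
    fused = fuse_local_global([[1.0, 2.0], [5.0, 6.0]],
                              [[3.0, 4.0], [7.0, 8.0]])
    print(fused)  # [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
    print(multi_task_loss(1.0, 0.5, 0.5))  # 1.0 + 0.05 + 0.05 = 1.1
```

The auxiliary terms only nudge the shared backbone toward distortion-aware representations; at inference, only the quality head would be used.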
Journal Description:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.