Luminance decomposition and reconstruction for high dynamic range Video Quality Assessment

IF 7.5 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Pattern Recognition Pub Date : 2024-09-12 DOI:10.1016/j.patcog.2024.111011

Source: https://www.sciencedirect.com/science/article/pii/S0031320324007623

Citations: 0

Abstract

High dynamic range (HDR) video represents a wider range of brightness, detail and colour than standard dynamic range (SDR) video. However, SDR-based video quality assessment (VQA) models struggle to capture HDR distortions. In addition, some existing methods designed for HDR video emphasise distortion in local areas of the video frame and ignore distortion of the frame as a whole. We therefore propose HDR-DRVQA, a no-reference VQA model for HDR video based on luminance decomposition and recombination. Specifically, HDR-DRVQA uses a luminance decomposition strategy to split video frames into different regions, so that perceptual features can be extracted explicitly from different regions of the high dynamic range. We further propose a residual aggregation module that recombines the multi-region features to extract static spatial distortion representations and dynamic motion perception (captured by feature differences). To exploit the Transformer's strength in long-range dependency modelling, this information is fed into a Transformer network for interactive learning of motion perception, and a stream of spatial distortion information is adaptively constructed from shallow to deep layers during temporal aggregation. We validate on publicly available HDR databases that our model significantly outperforms SDR VQA and existing HDR VQA methods.
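
The decomposition idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the two luminance thresholds, the three-band split, the mean-pooled "features" and all function names are assumptions made for illustration, and the residual aggregation module and the Transformer stage are not modelled at all. The sketch only shows a frame being split into luminance regions and per-region features being differenced across consecutive frames.

```python
from typing import List, Sequence, Tuple

Frame = List[List[float]]  # luminance values normalised to [0, 1], row-major


def decompose_luminance(
    frame: Frame, thresholds: Tuple[float, float] = (0.25, 0.75)
) -> List[Frame]:
    """Split a frame into dark, mid and bright regions.

    Each region is a masked copy of the frame: pixels outside the band are
    set to 0. The threshold values are hypothetical, not from the paper.
    """
    low, high = thresholds
    bands = [(float("-inf"), low), (low, high), (high, float("inf"))]
    return [
        [[v if lo <= v < hi else 0.0 for v in row] for row in frame]
        for lo, hi in bands
    ]


def region_features(
    frame: Frame, thresholds: Tuple[float, float] = (0.25, 0.75)
) -> List[float]:
    """One scalar per region: mean masked luminance over the whole frame.

    A stand-in for learned perceptual features (assumption).
    """
    n_pixels = sum(len(row) for row in frame)
    return [
        sum(sum(row) for row in region) / n_pixels
        for region in decompose_luminance(frame, thresholds)
    ]


def feature_differences(features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Differences between consecutive per-frame feature vectors,
    a crude proxy for the motion cue 'captured by feature differences'."""
    return [
        [cur - prev for prev, cur in zip(f_prev, f_cur)]
        for f_prev, f_cur in zip(features, features[1:])
    ]
```

Calling `region_features` on each frame of a clip and passing the resulting list to `feature_differences` yields one difference vector per pair of consecutive frames. Because every pixel falls into exactly one band, the three masked regions sum back to the original frame.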


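Source journal: Pattern Recognition (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 14.40

Self-citation rate: 16.20%

Articles published: 683

Review time: 5.6 months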

Journal description: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.
