Video Quality Analysis: Steps towards Unifying Full and No Reference Cases

Standards. Pub Date: 2022-09-01. DOI: 10.3390/standards2030027

P. Topiwala, W. Dai, J. Pian, Katalina Biondi, Arvind Krovvidi


Citations: 1

Abstract

Video quality assessment (VQA) is now a fast-growing field, maturing in the full reference (FR) case, yet still challenging in the rapidly expanding no reference (NR) case. In this paper, we investigate some variants of the popular FR VMAF video quality assessment algorithm, using both support vector regression and feedforward neural networks. We also extend it to the NR case, using different features but similar learning, to develop a partially unified framework for VQA. When fully trained, FR algorithms such as VMAF perform very well on test datasets, reaching a 90%+ match in the popular correlation coefficients PCC and SRCC. However, to predict performance in the wild, we train and test them individually on each dataset. With an 80/20 train/test split, we still achieve about 90% performance on average in both PCC and SRCC, with gains of up to 7–9% over VMAF, using an improved motion feature and better regression. Moreover, we even obtain good performance (about 75%) if we ignore the reference, treating FR as NR, which partly justifies our attempts at unification. In the true NR case, typically with amateur user-generated data, we draw on many more features, but still reduce complexity relative to the recent algorithms VIDEVAL and RAPIQUE, while achieving performance within 3–5% of them. Moreover, we develop a method to analyze the saliency of features, and conclude that for both VIDEVAL and RAPIQUE, a small subset of their features provides the bulk of the performance. We also touch upon the current best NR methods, MDT-VSFA and PVQ, which reach above 80% performance. In short, we identify encouraging improvements in trainability in FR, while constraining training complexity against leading methods in NR, and elucidating the saliency of features for feature selection.
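The abstract reports results as PCC and SRCC on an 80/20 train/test split. As a rough illustration of what those two numbers measure, here is a minimal standard-library sketch. The helper names (`pearson`, `ranks`, `spearman`, `split_80_20`) and the toy data are assumptions made for this example, not the authors' code or data, and no regressor is trained here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation (SRCC): PCC computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def split_80_20(items, seed=0):
    """Shuffle a copy of items and split it into 80% train / 20% test."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Hypothetical (predicted score, subjective score) pairs: monotone but
    # nonlinear, so SRCC is perfect while PCC falls short of 1.
    pairs = [(float(i), float(i) ** 3) for i in range(1, 21)]
    train, test = split_80_20(pairs)
    predicted = [p for p, _ in test]
    subjective = [s for _, s in test]
    print("PCC :", pearson(predicted, subjective))
    print("SRCC:", spearman(predicted, subjective))
```

PCC rewards a linear relationship between predicted and subjective scores, whereas SRCC only asks that the ordering agree, which is why the two are usually reported together.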

