Video Quality Analysis: Steps towards Unifying Full and No Reference Cases
P. Topiwala, W. Dai, J. Pian, Katalina Biondi, Arvind Krovvidi
Standards, published 2022-09-01
DOI: 10.3390/standards2030027 (https://doi.org/10.3390/standards2030027)
Citation count: 1
Abstract
Video quality assessment (VQA) is a fast-growing field: it is maturing in the full reference (FR) case, yet remains challenging in the rapidly expanding no reference (NR) case. In this paper, we investigate variants of the popular FR VMAF video quality assessment algorithm, using both support vector regression and feedforward neural networks. We also extend it to the NR case, using different features but similar learning, to develop a partially unified framework for VQA. When fully trained, FR algorithms such as VMAF perform very well on test datasets, reaching a 90%+ match in the popular correlation coefficients PCC and SRCC. However, to predict performance in the wild, we train and test them individually on each dataset. With an 80/20 train/test split, we still achieve about 90% performance on average in both PCC and SRCC, with gains of up to 7–9% over VMAF, using an improved motion feature and better regression. Moreover, we even obtain good performance (about 75%) if we ignore the reference, treating FR as NR, which partly justifies our attempts at unification. In the true NR case, typically with amateur user-generated data, we draw on many more features, yet still reduce complexity relative to the recent algorithms VIDEVAL and RAPIQUE, while achieving performance within 3–5% of them. We also develop a method to analyze the saliency of features, and conclude that for both VIDEVAL and RAPIQUE, a small subset of their features provides the bulk of the performance. We also touch upon the current best NR methods, MDT-VSFA and PVQ, which reach above 80% performance. In short, we identify encouraging improvements in trainability in FR, while constraining training complexity against leading methods in NR, and elucidate the saliency of features for feature selection.
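The evaluation protocol mentioned in the abstract (an 80/20 train/test split, scored by PCC and SRCC) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy scores, and the fixed random seed are all assumptions made for illustration, and the regression step (SVR or a neural network) is left out. The percentages quoted in the abstract presumably correspond to these coefficients scaled by 100, so that "90%" means a coefficient of about 0.90.

```python
import math
import random


def pearson(x, y):
    """Pearson linear correlation coefficient (PCC) between two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation coefficient (SRCC): the PCC of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def split_80_20(items, seed=0):
    """Shuffle a copy of the items and split it 80% train / 20% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) * 4) // 5
    return shuffled[:cut], shuffled[cut:]


# Toy data (made up): subjective scores and a monotone but nonlinear predictor.
mos = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.0, 4.0, 9.0, 16.0, 25.0]
pcc = pearson(mos, pred)    # below 1: the relationship is not linear
srcc = spearman(mos, pred)  # equal to 1 up to rounding: the ordering is preserved

# Hypothetical video indices split into training and test sets.
train_ids, test_ids = split_80_20(range(10))
```

The toy example shows why both coefficients are usually reported together: SRCC rewards a predictor that orders videos correctly, while PCC additionally penalizes a nonlinear relationship between predicted and subjective scores.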