{"title":"为野外图像质量评估探索丰富的主观质量信息","authors":"Xiongkuo Min, Yixuan Gao, Yuqin Cao, Guangtao Zhai, Wenjun Zhang, Huifang Sun, Chang Wen Chen","doi":"arxiv-2409.05540","DOIUrl":null,"url":null,"abstract":"Traditional in the wild image quality assessment (IQA) models are generally\ntrained with the quality labels of mean opinion score (MOS), while missing the\nrich subjective quality information contained in the quality ratings, for\nexample, the standard deviation of opinion scores (SOS) or even distribution of\nopinion scores (DOS). In this paper, we propose a novel IQA method named\nRichIQA to explore the rich subjective rating information beyond MOS to predict\nimage quality in the wild. RichIQA is characterized by two key novel designs:\n(1) a three-stage image quality prediction network which exploits the powerful\nfeature representation capability of the Convolutional vision Transformer (CvT)\nand mimics the short-term and long-term memory mechanisms of human brain; (2) a\nmulti-label training strategy in which rich subjective quality information like\nMOS, SOS and DOS are concurrently used to train the quality prediction network.\nPowered by these two novel designs, RichIQA is able to predict the image\nquality in terms of a distribution, from which the mean image quality can be\nsubsequently obtained. Extensive experimental results verify that the\nthree-stage network is tailored to predict rich quality information, while the\nmulti-label training strategy can fully exploit the potentials within\nsubjective quality rating and enhance the prediction performance and\ngeneralizability of the network. RichIQA outperforms state-of-the-art\ncompetitors on multiple large-scale in the wild IQA databases with rich\nsubjective rating labels. The code of RichIQA will be made publicly available\non GitHub.","PeriodicalId":501480,"journal":{"name":"arXiv - CS - Multimedia","volume":"11 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild\",\"authors\":\"Xiongkuo Min, Yixuan Gao, Yuqin Cao, Guangtao Zhai, Wenjun Zhang, Huifang Sun, Chang Wen Chen\",\"doi\":\"arxiv-2409.05540\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Traditional in the wild image quality assessment (IQA) models are generally\\ntrained with the quality labels of mean opinion score (MOS), while missing the\\nrich subjective quality information contained in the quality ratings, for\\nexample, the standard deviation of opinion scores (SOS) or even distribution of\\nopinion scores (DOS). In this paper, we propose a novel IQA method named\\nRichIQA to explore the rich subjective rating information beyond MOS to predict\\nimage quality in the wild. RichIQA is characterized by two key novel designs:\\n(1) a three-stage image quality prediction network which exploits the powerful\\nfeature representation capability of the Convolutional vision Transformer (CvT)\\nand mimics the short-term and long-term memory mechanisms of human brain; (2) a\\nmulti-label training strategy in which rich subjective quality information like\\nMOS, SOS and DOS are concurrently used to train the quality prediction network.\\nPowered by these two novel designs, RichIQA is able to predict the image\\nquality in terms of a distribution, from which the mean image quality can be\\nsubsequently obtained. 
Extensive experimental results verify that the\\nthree-stage network is tailored to predict rich quality information, while the\\nmulti-label training strategy can fully exploit the potentials within\\nsubjective quality rating and enhance the prediction performance and\\ngeneralizability of the network. RichIQA outperforms state-of-the-art\\ncompetitors on multiple large-scale in the wild IQA databases with rich\\nsubjective rating labels. The code of RichIQA will be made publicly available\\non GitHub.\",\"PeriodicalId\":501480,\"journal\":{\"name\":\"arXiv - CS - Multimedia\",\"volume\":\"11 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.05540\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.05540","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Exploring Rich Subjective Quality Information for Image Quality Assessment in the Wild
Traditional in-the-wild image quality assessment (IQA) models are generally trained with mean opinion score (MOS) labels alone, and therefore miss the rich subjective quality information contained in the raw quality ratings, such as the standard deviation of opinion scores (SOS) or even the full distribution of opinion scores (DOS). In this paper, we propose a novel IQA method named RichIQA that explores this rich subjective rating information beyond MOS to predict image quality in the wild. RichIQA is characterized by two key novel designs: (1) a three-stage image quality prediction network that exploits the powerful feature representation capability of the Convolutional vision Transformer (CvT) and mimics the short-term and long-term memory mechanisms of the human brain; (2) a multi-label training strategy in which rich subjective quality information, namely MOS, SOS, and DOS, is used concurrently to train the quality prediction network.
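The abstract does not specify the exact losses used in this multi-label strategy; the following is only a rough sketch of what such an objective could look like, where the loss types, weights, and tensor shapes are assumptions for illustration rather than the paper's formulation. It combines a distribution-matching term on DOS with moment-matching terms on MOS and SOS.

```python
# Hypothetical multi-label IQA training loss; the specific terms and weights
# are illustrative assumptions, not the exact formulation used by RichIQA.
import torch
import torch.nn.functional as F

def multi_label_loss(pred_dos, target_dos, target_mos, target_sos,
                     score_levels, w_dos=1.0, w_mos=1.0, w_sos=0.5):
    """pred_dos/target_dos: (B, K) distributions over K opinion-score levels;
    target_mos/target_sos: (B,) scalars; score_levels: (K,) tensor of scores."""
    # DOS term: match the predicted and human opinion-score distributions.
    loss_dos = F.kl_div(pred_dos.clamp_min(1e-8).log(), target_dos,
                        reduction="batchmean")
    # MOS/SOS terms: moments of the predicted distribution should match labels.
    pred_mos = (pred_dos * score_levels).sum(dim=1)
    pred_var = (pred_dos * (score_levels - pred_mos.unsqueeze(1)) ** 2).sum(dim=1)
    loss_mos = F.l1_loss(pred_mos, target_mos)
    loss_sos = F.l1_loss(pred_var.clamp_min(1e-8).sqrt(), target_sos)
    return w_dos * loss_dos + w_mos * loss_mos + w_sos * loss_sos
```

Because the MOS and SOS terms are computed from the same predicted distribution, a single distributional output head can be supervised by all three label types at once, which is the gist of the multi-label idea.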
Powered by these two novel designs, RichIQA predicts image quality as a distribution, from which the mean image quality can subsequently be obtained.
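To make concrete how the mean quality is read off a predicted distribution, here is a minimal sketch; the discrete 1-5 rating scale and the helper name are assumptions for illustration, not details from the paper.

```python
# Recovering MOS and SOS from a predicted opinion-score distribution (DOS).
import numpy as np

def moments_from_dos(dos, score_levels=(1, 2, 3, 4, 5)):
    """dos: probabilities over the rating levels, summing to 1."""
    dos = np.asarray(dos, dtype=float)
    levels = np.asarray(score_levels, dtype=float)
    mos = float((dos * levels).sum())                        # mean opinion score
    sos = float(np.sqrt((dos * (levels - mos) ** 2).sum()))  # std of opinions
    return mos, sos

# Example: a distribution peaked at "4" gives MOS = 3.75 and SOS ≈ 1.09.
print(moments_from_dos([0.05, 0.10, 0.15, 0.45, 0.25]))
```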
Extensive experimental results verify that the three-stage network is well suited to predicting rich quality information, while the multi-label training strategy fully exploits the potential of subjective quality ratings and enhances the prediction performance and generalizability of the network. RichIQA outperforms state-of-the-art competitors on multiple large-scale in-the-wild IQA databases with rich subjective rating labels. The code of RichIQA will be made publicly available on GitHub.