Attention Down-Sampling Transformer, Relative Ranking and Self-Consistency for Blind Image Quality Assessment
{"title":"用于盲法图像质量评估的注意力向下采样变换器、相对排序和自一致性","authors":"Mohammed Alsaafin, Musab Alsheikh, Saeed Anwar, Muhammad Usman","doi":"arxiv-2409.07115","DOIUrl":null,"url":null,"abstract":"The no-reference image quality assessment is a challenging domain that\naddresses estimating image quality without the original reference. We introduce\nan improved mechanism to extract local and non-local information from images\nvia different transformer encoders and CNNs. The utilization of Transformer\nencoders aims to mitigate locality bias and generate a non-local representation\nby sequentially processing CNN features, which inherently capture local visual\nstructures. Establishing a stronger connection between subjective and objective\nassessments is achieved through sorting within batches of images based on\nrelative distance information. A self-consistency approach to self-supervision\nis presented, explicitly addressing the degradation of no-reference image\nquality assessment (NR-IQA) models under equivariant transformations. Our\napproach ensures model robustness by maintaining consistency between an image\nand its horizontally flipped equivalent. Through empirical evaluation of five\npopular image quality assessment datasets, the proposed model outperforms\nalternative algorithms in the context of no-reference image quality assessment\ndatasets, especially on smaller datasets. Codes are available at\n\\href{https://github.com/mas94/ADTRS}{https://github.com/mas94/ADTRS}","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Attention Down-Sampling Transformer, Relative Ranking and Self-Consistency for Blind Image Quality Assessment\",\"authors\":\"Mohammed Alsaafin, Musab Alsheikh, Saeed Anwar, Muhammad Usman\",\"doi\":\"arxiv-2409.07115\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The no-reference image quality assessment is a challenging domain that\\naddresses estimating image quality without the original reference. We introduce\\nan improved mechanism to extract local and non-local information from images\\nvia different transformer encoders and CNNs. The utilization of Transformer\\nencoders aims to mitigate locality bias and generate a non-local representation\\nby sequentially processing CNN features, which inherently capture local visual\\nstructures. Establishing a stronger connection between subjective and objective\\nassessments is achieved through sorting within batches of images based on\\nrelative distance information. A self-consistency approach to self-supervision\\nis presented, explicitly addressing the degradation of no-reference image\\nquality assessment (NR-IQA) models under equivariant transformations. Our\\napproach ensures model robustness by maintaining consistency between an image\\nand its horizontally flipped equivalent. Through empirical evaluation of five\\npopular image quality assessment datasets, the proposed model outperforms\\nalternative algorithms in the context of no-reference image quality assessment\\ndatasets, especially on smaller datasets. 
Codes are available at\\n\\\\href{https://github.com/mas94/ADTRS}{https://github.com/mas94/ADTRS}\",\"PeriodicalId\":501289,\"journal\":{\"name\":\"arXiv - EE - Image and Video Processing\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - EE - Image and Video Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.07115\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - EE - Image and Video Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07115","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Mohammed Alsaafin, Musab Alsheikh, Saeed Anwar, Muhammad Usman
No-reference image quality assessment (NR-IQA) is a challenging domain that addresses estimating image quality without the original reference image. We introduce an improved mechanism to extract local and non-local information from images via different transformer encoders and CNNs. The transformer encoders mitigate locality bias and generate a non-local representation by sequentially processing CNN features, which inherently capture local visual structures; a sketch of this hybrid design follows.
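Below is a minimal, hypothetical PyTorch sketch of such a CNN-plus-transformer pipeline. The backbone choice, layer sizes, and module names are illustrative assumptions, not the paper's exact architecture, and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class HybridIQABackbone(nn.Module):
    """Local CNN features refined by a non-local transformer encoder (sketch)."""

    def __init__(self, d_model=256, n_heads=8, n_layers=2):
        super().__init__()
        cnn = resnet50(weights=None)
        # Keep everything up to the last conv stage: output (B, 2048, H/32, W/32).
        self.cnn = nn.Sequential(*list(cnn.children())[:-2])
        self.proj = nn.Conv2d(2048, d_model, kernel_size=1)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)  # scalar quality score

    def forward(self, x):
        f = self.proj(self.cnn(x))             # local CNN features
        tokens = f.flatten(2).transpose(1, 2)  # (B, H*W, d_model) token sequence
        z = self.encoder(tokens)               # non-local self-attention mixing
        return self.head(z.mean(dim=1)).squeeze(-1)  # pooled quality prediction
```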
A stronger connection between subjective and objective assessments is established by sorting images within each batch according to relative distance information, as in the ranking sketch below.
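One plausible reading of this within-batch ranking objective is a margin-based pairwise loss: wherever two images have a true quality gap, the predictions must reproduce the ordering by at least a margin. The exact loss form and margin value are assumptions, not the paper's stated formulation.

```python
import torch
import torch.nn.functional as F

def relative_ranking_loss(pred, mos, margin=0.5):
    """pred, mos: (B,) predicted and ground-truth quality scores."""
    diff_gt = mos.unsqueeze(0) - mos.unsqueeze(1)      # (B, B) ground-truth gaps
    diff_pred = pred.unsqueeze(0) - pred.unsqueeze(1)  # (B, B) predicted gaps
    # For every pair with a true quality gap, penalize predictions that fail
    # to preserve the ordering by at least the margin.
    target = torch.sign(diff_gt)
    loss = F.relu(margin - target * diff_pred)
    mask = diff_gt.abs() > 0
    return loss[mask].mean() if mask.any() else pred.new_zeros(())
```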
A self-consistency approach to self-supervision is presented, explicitly addressing the degradation of NR-IQA models under equivariant transformations. Our approach promotes model robustness by enforcing consistency between the predictions for an image and its horizontally flipped counterpart, sketched below.
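A minimal sketch of that self-consistency term, assuming a mean-squared distance between the two predictions (the distance function and its weight against the main objective are assumptions):

```python
import torch
import torch.nn.functional as F

def self_consistency_loss(model, x):
    # Predictions for the batch and for its horizontally flipped copy
    # (flip along the width axis) should agree.
    pred = model(x)
    pred_flip = model(torch.flip(x, dims=[-1]))
    return F.mse_loss(pred, pred_flip)
```

In training, one would presumably combine the main quality-regression loss with the ranking and self-consistency terms as a weighted sum; the weighting is an assumption here.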
In empirical evaluations on five popular image quality assessment datasets, the proposed model outperforms alternative NR-IQA algorithms, especially on the smaller datasets. Code is available at https://github.com/mas94/ADTRS.