Self-supervised panoramic stitched image quality assessment based on contrastive learning

IF 3.1 | Region 4, Computer Science | JCR Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation | Pub Date: 2025-06-16 | DOI: 10.1016/j.jvcir.2025.104519

Xiaoer Li, Kexin Zhang, Feng Shao

Volume 111, Article 104519 | Journal Article | https://www.sciencedirect.com/science/article/pii/S1047320325001336

Citations: 0

Abstract


Self-supervised panoramic stitched image quality assessment based on contrastive learning

In recent years, with the rapid development of virtual reality technology and the advent of the 5G era, panoramic images have received increasingly widespread attention. Researchers have proposed numerous image stitching algorithms, but research on assessing the quality of stitched images remains relatively scarce. Furthermore, the stitching distortions introduced during the generation of panoramic content make quality assessment even more challenging. In this paper, we propose a new network for panoramic stitched image quality assessment. The model contains two stages: a contrastive learning stage and a quality prediction stage. In the first stage, we introduce two pretext tasks as learning objectives: distortion type prediction and distortion level prediction. This allows the network to learn corresponding features from different viewpoints with varying distortion types and severities. During this process, we use prior knowledge of four pre-classified distortion types as category labels and three distortion severity levels as severity labels to assist the pretext tasks. Subsequently, a universal convolutional neural network (CNN) model is trained using a pairwise comparison method. In the quality prediction stage, the trained CNN weights are frozen, and the learned feature representation is mapped to the final quality score through linear regression. We evaluate the proposed network on two benchmark databases, and the results demonstrate that combining the two pretext tasks yields more accurate predictions. Overall, our method outperforms existing full-reference and no-reference models designed for 2D images and for 360° panoramic stitched image quality assessment.
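The two-stage idea in the abstract can be illustrated with a small, standard-library-only Python sketch. It is not the authors' implementation: the abstract does not give the loss function, the backbone, or how the two pretext labels are combined. The sketch assumes that the four distortion types and three severity levels are folded into one joint label, that "pairwise comparison" is approximated by a classic margin-based contrastive loss, and that the linear regressor is fitted by plain gradient descent. All function names and the toy feature vectors are hypothetical.

```python
import math

NUM_TYPES = 4   # four pre-classified distortion types (stated in the abstract)
NUM_LEVELS = 3  # three distortion severity levels (stated in the abstract)


def pretext_label(dist_type: int, dist_level: int) -> int:
    """Fold (type, level) into one of 12 joint classes (an assumption)."""
    if not (0 <= dist_type < NUM_TYPES and 0 <= dist_level < NUM_LEVELS):
        raise ValueError("distortion type or level out of range")
    return dist_type * NUM_LEVELS + dist_level


def pairwise_contrastive_loss(f_a, f_b, same_label, margin=1.0):
    """Margin-based contrastive loss on one pair of feature vectors.

    Pairs with the same pretext label are pulled together; pairs with
    different labels are pushed at least `margin` apart (assumed loss).
    """
    dist = math.dist(f_a, f_b)
    if same_label:
        return dist ** 2
    return max(0.0, margin - dist) ** 2


def fit_linear_regressor(features, scores, lr=0.05, epochs=2000):
    """Stage 2: fit weights and bias on frozen features (gradient descent on MSE)."""
    dim = len(features[0])
    n = len(features)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, scores):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for i in range(dim):
                grad_w[i] += 2 * err * x[i] / n
            grad_b += 2 * err / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_quality(w, b, feature):
    """Map a frozen feature vector to a scalar quality score."""
    return sum(wi * xi for wi, xi in zip(w, feature)) + b


if __name__ == "__main__":
    # Toy 2-D vectors standing in for frozen CNN features (hypothetical data).
    feats = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    mos = [1.0, 3.0, 4.0, 6.0]  # made-up subjective scores for illustration
    w, b = fit_linear_regressor(feats, mos)
    print(predict_quality(w, b, [1.0, 1.0]))
```

In this sketch, only `fit_linear_regressor` sees quality scores. The contrastive stage relies solely on the distortion type and level labels, which is what makes the representation learning self-supervised with respect to subjective ratings.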


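Source journal

Journal of Visual Communication and Image Representation (Engineering Technology - Computer Science: Software Engineering)

CiteScore: 5.40

Self-citation rate: 11.50%

Articles published: 188

Review time: 9.9 months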

Journal description: The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.