Transformer-Based Light Field Geometry Learning for No-Reference Light Field Image Quality Assessment

IF 3.2 | Zone 1, Computer Science | JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC
Lili Lin;Siyu Bai;Mengjia Qu;Xuehui Wei;Luyao Wang;Feifan Wu;Biao Liu;Wenhui Zhou;Ercan Engin Kuruoglu
IEEE Transactions on Broadcasting, vol. 70, no. 2, pp. 597-606, published 2024-01-31
DOI: 10.1109/TBC.2024.3353579
IEEE Xplore: https://ieeexplore.ieee.org/document/10418048/
Citations: 0

Abstract

Elevating traditional 2-dimensional (2D) planar displays to 4-dimensional (4D) light field displays can significantly enhance users' immersion and sense of realism, because a light field image (LFI) provides multiple visual cues: multi-view disparity, motion disparity, and selective focus. It is therefore crucial to establish a light field image quality assessment (LF-IQA) model that aligns with human visual perception. However, evaluating the perceptual quality of multiple light field visual cues simultaneously and consistently remains a challenge. To this end, this paper proposes a Transformer-based method that explicitly learns light field geometry for no-reference light field image quality assessment. Specifically, to explicitly learn the light field epipolar geometry, we stack light field sub-aperture images (SAIs) into four SAI stacks along four specific light field angular directions, and use a sub-grouping strategy to hierarchically learn local and global light field geometric features. Then, a Transformer encoder with a spatial-shift tokenization strategy learns a structure-aware representation of light field geometric distortion, which is used to regress the final quality score. Experiments are carried out on three commonly used LF-IQA datasets: Win5-LID, NBU-LF1.0, and MPI-LFA. The results demonstrate that our model outperforms state-of-the-art methods and correlates highly with human perception. The source code is publicly available at https://github.com/windyz77/GeoNRLFIQA.
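
The SAI stacking step described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository. The abstract does not say which four angular directions are used, so the sketch assumes the common choice of horizontal, vertical, and the two diagonals through the central view. The function name `directional_sai_stacks`, the `(U, V, H, W)` array layout, and the square angular grid are likewise assumptions made for illustration; the sub-grouping and tokenization steps are not covered.

```python
import numpy as np


def directional_sai_stacks(lf):
    """Stack sub-aperture images (SAIs) along four angular directions.

    lf: light field array of shape (U, V, H, W) or (U, V, H, W, C), where
        (U, V) index the angular position and (H, W) the pixels of each SAI.
    Assumption: U == V, and the four directions are the row, column, and
    two diagonals passing through the central view.
    """
    n_u, n_v = lf.shape[:2]
    if n_u != n_v:
        raise ValueError("this sketch assumes a square angular grid")
    center = n_u // 2
    idx = np.arange(n_u)
    return {
        "horizontal": lf[center, idx],        # central row, all columns
        "vertical": lf[idx, center],          # all rows, central column
        "diagonal": lf[idx, idx],             # top-left to bottom-right
        "anti_diagonal": lf[idx, idx[::-1]],  # top-right to bottom-left
    }


# Hypothetical 9x9 light field with 32x32 grayscale SAIs.
light_field = np.random.rand(9, 9, 32, 32)
for name, stack in directional_sai_stacks(light_field).items():
    print(name, stack.shape)
```

Each stack holds one SAI per angular step, so slicing a stack across its first axis traces how a scene point shifts between neighboring views along that direction, which is the epipolar structure the abstract refers to.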
Source journal
IEEE Transactions on Broadcasting (Engineering & Technology: Telecommunications)
CiteScore: 9.40
Self-citation rate: 31.10%
Articles published: 79
Review time: 6-12 weeks
About the journal: The Society's Field of Interest is "Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects." In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that "broadcast systems includes all aspects of transmission, propagation, and reception."