Structure-Aware Pre-Selected Neural Rendering for Light Field Reconstruction

IF 8.4 · CAS Tier 1 (Computer Science) · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS
Song Chang;Youfang Lin;Shuo Zhang
{"title":"面向光场重建的结构感知预选择神经渲染","authors":"Song Chang;Youfang Lin;Shuo Zhang","doi":"10.1109/TMM.2024.3521784","DOIUrl":null,"url":null,"abstract":"As densely-sampled Light Field (LF) images are beneficial to many applications, LF reconstruction becomes an important technology in related fields. Recently, neural rendering shows great potential in reconstruction tasks. However, volume rendering in existing methods needs to sample many points on the whole camera ray or epipolar line, which is time-consuming. In this paper, specifically for LF images with regular angular sampling, we propose a novel Structure-Aware Pre-Selected neural rendering framework for LF reconstruction. Instead of sampling on the whole epipolar line, we propose to sample on several specific positions, which are estimated using the color and inherent scene structure information explored in the regular angular sampled LF images. Sampling only a few points that closely match the target pixel, the feature of the target pixel is quickly rendered with high-quality. Finally, we fuse the features and decode them in the view dimension to obtain the final target view. Experiments show that the proposed method outperforms the state-of-the-art LF reconstruction methods in both qualitative and quantitative comparisons across various tasks. Our method also surpasses the most existing methods in terms of speed. Moreover, without any retraining or fine-tuning, the performance of our method with no-per-scene optimization is even better than the methods with per-scene optimization.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"1574-1587"},"PeriodicalIF":8.4000,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Structure-Aware Pre-Selected Neural Rendering for Light Field Reconstruction\",\"authors\":\"Song Chang;Youfang Lin;Shuo Zhang\",\"doi\":\"10.1109/TMM.2024.3521784\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As densely-sampled Light Field (LF) images are beneficial to many applications, LF reconstruction becomes an important technology in related fields. Recently, neural rendering shows great potential in reconstruction tasks. However, volume rendering in existing methods needs to sample many points on the whole camera ray or epipolar line, which is time-consuming. In this paper, specifically for LF images with regular angular sampling, we propose a novel Structure-Aware Pre-Selected neural rendering framework for LF reconstruction. Instead of sampling on the whole epipolar line, we propose to sample on several specific positions, which are estimated using the color and inherent scene structure information explored in the regular angular sampled LF images. Sampling only a few points that closely match the target pixel, the feature of the target pixel is quickly rendered with high-quality. Finally, we fuse the features and decode them in the view dimension to obtain the final target view. Experiments show that the proposed method outperforms the state-of-the-art LF reconstruction methods in both qualitative and quantitative comparisons across various tasks. Our method also surpasses the most existing methods in terms of speed. 
Moreover, without any retraining or fine-tuning, the performance of our method with no-per-scene optimization is even better than the methods with per-scene optimization.\",\"PeriodicalId\":13273,\"journal\":{\"name\":\"IEEE Transactions on Multimedia\",\"volume\":\"27 \",\"pages\":\"1574-1587\"},\"PeriodicalIF\":8.4000,\"publicationDate\":\"2024-12-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Multimedia\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10817634/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10817634/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

As densely-sampled Light Field (LF) images benefit many applications, LF reconstruction has become an important technology in related fields. Recently, neural rendering has shown great potential in reconstruction tasks. However, volume rendering in existing methods needs to sample many points along the whole camera ray or epipolar line, which is time-consuming. In this paper, specifically for LF images with regular angular sampling, we propose a novel Structure-Aware Pre-Selected neural rendering framework for LF reconstruction. Instead of sampling along the whole epipolar line, we propose to sample at several specific positions, which are estimated using the color and inherent scene structure information explored in the regularly angular-sampled LF images. By sampling only a few points that closely match the target pixel, the features of the target pixel are quickly rendered with high quality. Finally, we fuse the features and decode them in the view dimension to obtain the final target view. Experiments show that the proposed method outperforms state-of-the-art LF reconstruction methods in both qualitative and quantitative comparisons across various tasks. Our method also surpasses most existing methods in terms of speed. Moreover, without any retraining or fine-tuning, the performance of our method without per-scene optimization is even better than that of methods with per-scene optimization.
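To make the pre-selection idea in the abstract concrete, the sketch below is a minimal, hypothetical NumPy example, not the authors' code: it reprojects a target pixel into each regularly sampled input view under a small set of disparity hypotheses and keeps only the few most photo-consistent positions, as a simple stand-in for the learned, structure-aware pre-selection described above. The function name, the variance-based scoring, and the toy 3x3 light field are all assumptions for illustration.

```python
import numpy as np


def preselect_samples(target_uv, target_xy, views, view_uvs,
                      disparities, num_keep=4):
    """Illustrative only: score disparity hypotheses by photo-consistency
    across the input views and keep the few most consistent positions,
    instead of sampling the whole epipolar line densely."""
    H, W, _ = views[0].shape
    per_view_colors = []
    for img, uv in zip(views, view_uvs):
        baseline = np.asarray(uv, dtype=float) - np.asarray(target_uv, dtype=float)
        # With regular angular sampling, a target pixel with disparity d
        # reprojects to target_xy + d * baseline in this sub-aperture view.
        pos = np.asarray(target_xy, dtype=float) + disparities[:, None] * baseline
        xs = np.clip(np.round(pos[:, 0]).astype(int), 0, W - 1)
        ys = np.clip(np.round(pos[:, 1]).astype(int), 0, H - 1)
        per_view_colors.append(img[ys, xs])            # (D, 3) colours per hypothesis

    colors = np.stack(per_view_colors, axis=0)         # (V, D, 3)
    # Cheap structure cue: colour variance across views per hypothesis;
    # low variance means the views agree, so the position likely matches.
    cost = colors.var(axis=0).mean(axis=-1)            # (D,)
    keep = np.argsort(cost)[:num_keep]
    return disparities[keep], colors[:, keep]


# Toy usage on a random 3x3 light field (purely illustrative).
rng = np.random.default_rng(0)
views = [rng.random((32, 32, 3)) for _ in range(9)]
view_uvs = [(u, v) for u in range(3) for v in range(3)]
d_sel, c_sel = preselect_samples((1.0, 1.0), (16.0, 16.0), views, view_uvs,
                                 np.linspace(-2.0, 2.0, 17), num_keep=4)
print(d_sel, c_sel.shape)  # 4 selected disparities; colours of shape (9, 4, 3)
```

In the paper, the retained positions would feed a neural feature renderer whose outputs are fused and decoded along the view dimension; only the selection step is sketched here.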
Source Journal
IEEE Transactions on Multimedia (Engineering & Technology - Telecommunications)
CiteScore: 11.70
Self-citation rate: 11.00%
Annual publications: 576
Review time: 5.5 months
Journal description: The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.