Structure-Aware Pre-Selected Neural Rendering for Light Field Reconstruction

IF 8.4 · CAS Tier 1 (Computer Science) · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS
Song Chang;Youfang Lin;Shuo Zhang
{"title":"面向光场重建的结构感知预选择神经渲染","authors":"Song Chang;Youfang Lin;Shuo Zhang","doi":"10.1109/TMM.2024.3521784","DOIUrl":null,"url":null,"abstract":"As densely-sampled Light Field (LF) images are beneficial to many applications, LF reconstruction becomes an important technology in related fields. Recently, neural rendering shows great potential in reconstruction tasks. However, volume rendering in existing methods needs to sample many points on the whole camera ray or epipolar line, which is time-consuming. In this paper, specifically for LF images with regular angular sampling, we propose a novel Structure-Aware Pre-Selected neural rendering framework for LF reconstruction. Instead of sampling on the whole epipolar line, we propose to sample on several specific positions, which are estimated using the color and inherent scene structure information explored in the regular angular sampled LF images. Sampling only a few points that closely match the target pixel, the feature of the target pixel is quickly rendered with high-quality. Finally, we fuse the features and decode them in the view dimension to obtain the final target view. Experiments show that the proposed method outperforms the state-of-the-art LF reconstruction methods in both qualitative and quantitative comparisons across various tasks. Our method also surpasses the most existing methods in terms of speed. Moreover, without any retraining or fine-tuning, the performance of our method with no-per-scene optimization is even better than the methods with per-scene optimization.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"1574-1587"},"PeriodicalIF":8.4000,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Structure-Aware Pre-Selected Neural Rendering for Light Field Reconstruction\",\"authors\":\"Song Chang;Youfang Lin;Shuo Zhang\",\"doi\":\"10.1109/TMM.2024.3521784\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As densely-sampled Light Field (LF) images are beneficial to many applications, LF reconstruction becomes an important technology in related fields. Recently, neural rendering shows great potential in reconstruction tasks. However, volume rendering in existing methods needs to sample many points on the whole camera ray or epipolar line, which is time-consuming. In this paper, specifically for LF images with regular angular sampling, we propose a novel Structure-Aware Pre-Selected neural rendering framework for LF reconstruction. Instead of sampling on the whole epipolar line, we propose to sample on several specific positions, which are estimated using the color and inherent scene structure information explored in the regular angular sampled LF images. Sampling only a few points that closely match the target pixel, the feature of the target pixel is quickly rendered with high-quality. Finally, we fuse the features and decode them in the view dimension to obtain the final target view. Experiments show that the proposed method outperforms the state-of-the-art LF reconstruction methods in both qualitative and quantitative comparisons across various tasks. Our method also surpasses the most existing methods in terms of speed. 
Moreover, without any retraining or fine-tuning, the performance of our method with no-per-scene optimization is even better than the methods with per-scene optimization.\",\"PeriodicalId\":13273,\"journal\":{\"name\":\"IEEE Transactions on Multimedia\",\"volume\":\"27 \",\"pages\":\"1574-1587\"},\"PeriodicalIF\":8.4000,\"publicationDate\":\"2024-12-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Multimedia\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10817634/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10817634/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

As densely-sampled Light Field (LF) images benefit many applications, LF reconstruction has become an important technology in related fields. Recently, neural rendering has shown great potential in reconstruction tasks. However, volume rendering in existing methods needs to sample many points along the whole camera ray or epipolar line, which is time-consuming. In this paper, specifically for LF images with regular angular sampling, we propose a novel Structure-Aware Pre-Selected neural rendering framework for LF reconstruction. Instead of sampling along the whole epipolar line, we propose to sample at several specific positions, which are estimated using the color and inherent scene structure information explored in the regularly angular-sampled LF images. By sampling only a few points that closely match the target pixel, the features of the target pixel are quickly rendered with high quality. Finally, we fuse the features and decode them in the view dimension to obtain the final target view. Experiments show that the proposed method outperforms state-of-the-art LF reconstruction methods in both qualitative and quantitative comparisons across various tasks. Our method also surpasses most existing methods in terms of speed. Moreover, without any retraining or fine-tuning, the performance of our method without per-scene optimization is even better than that of methods with per-scene optimization.
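To make the pre-selection idea in the abstract concrete, the sketch below is a minimal, hypothetical NumPy example, not the authors' code: it reprojects a target pixel into each regularly sampled input view under a small set of disparity hypotheses and keeps only the few most photo-consistent positions, as a simple stand-in for the learned, structure-aware pre-selection described above. The function name, the variance-based scoring, and the toy 3x3 light field are all assumptions for illustration.

```python
import numpy as np


def preselect_samples(target_uv, target_xy, views, view_uvs,
                      disparities, num_keep=4):
    """Illustrative only: score disparity hypotheses by photo-consistency
    across the input views and keep the few most consistent positions,
    instead of sampling the whole epipolar line densely."""
    H, W, _ = views[0].shape
    per_view_colors = []
    for img, uv in zip(views, view_uvs):
        baseline = np.asarray(uv, dtype=float) - np.asarray(target_uv, dtype=float)
        # With regular angular sampling, a target pixel with disparity d
        # reprojects to target_xy + d * baseline in this sub-aperture view.
        pos = np.asarray(target_xy, dtype=float) + disparities[:, None] * baseline
        xs = np.clip(np.round(pos[:, 0]).astype(int), 0, W - 1)
        ys = np.clip(np.round(pos[:, 1]).astype(int), 0, H - 1)
        per_view_colors.append(img[ys, xs])            # (D, 3) colours per hypothesis

    colors = np.stack(per_view_colors, axis=0)         # (V, D, 3)
    # Cheap structure cue: colour variance across views per hypothesis;
    # low variance means the views agree, so the position likely matches.
    cost = colors.var(axis=0).mean(axis=-1)            # (D,)
    keep = np.argsort(cost)[:num_keep]
    return disparities[keep], colors[:, keep]


# Toy usage on a random 3x3 light field (purely illustrative).
rng = np.random.default_rng(0)
views = [rng.random((32, 32, 3)) for _ in range(9)]
view_uvs = [(u, v) for u in range(3) for v in range(3)]
d_sel, c_sel = preselect_samples((1.0, 1.0), (16.0, 16.0), views, view_uvs,
                                 np.linspace(-2.0, 2.0, 17), num_keep=4)
print(d_sel, c_sel.shape)  # 4 selected disparities; colours of shape (9, 4, 3)
```

In the paper, the retained positions would feed a neural feature renderer whose outputs are fused and decoded along the view dimension; only the selection step is sketched here.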
Source Journal
IEEE Transactions on Multimedia (Engineering & Technology - Telecommunications)
CiteScore: 11.70
Self-citation rate: 11.00%
Annual publications: 576
Review time: 5.5 months
Journal description: The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.