Multi-Scale Spatial-Angular Collaborative Guidance Network for Heterogeneous Light Field Spatial Super-Resolution
Zean Chen; Yeyao Chen; Gangyi Jiang; Mei Yu; Haiyong Xu; Ting Luo
IEEE Transactions on Broadcasting, vol. 70, no. 4, pp. 1221-1235 (Journal Article), published 2024-07-31
DOI: 10.1109/TBC.2024.3420748
URL: https://ieeexplore.ieee.org/document/10616115/
Impact factor: 3.2; JCR: Q2 (Engineering, Electrical & Electronic)
Abstract
Light Field (LF) imaging captures the spatial and angular information of light rays in the real world and enables various applications, including digital refocusing and single-shot depth estimation. Unfortunately, due to the limited sensor size of LF cameras, the captured LF images suffer from low spatial resolution while providing a dense angular sampling. Existing single-input LF spatial super-resolution (SR) methods usually utilize the inherent sub-pixel information to recover high-frequency textures, but they struggle in large-scale SR tasks (e.g., $8\times$). Conversely, the heterogeneous imaging approach combining an LF camera and a 2D digital camera can capture richer information for effective large-scale reconstruction. To this end, this paper proposes a multi-scale spatial-angular collaborative guidance network (LF-MSACGNet) for heterogeneous LF spatial SR. Specifically, a context-guided deformable alignment module is first designed, which utilizes high-level feature information to achieve precise alignment between the low-resolution LF image and the 2D high-resolution image. Subsequently, a Transformer-driven spatial-angular collaborative guidance module is constructed to explore the spatial-angular correlation and complementarity. This allows for an effective fusion of the multi-resolution spatial-angular features. Finally, the SR LF image is reconstructed through a spatial-angular aggregation module. In addition, a multi-scale training strategy is adopted to subdivide the challenging large-scale SR task into multiple simple tasks to boost the SR performance. Experimental results on seven public datasets show that the proposed method outperforms the state-of-the-art SR methods in both quantitative and qualitative comparison, and exhibits favorable robustness to wide baseline LF images.
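The idea of subdividing a large-scale SR task into simpler ones can be illustrated with a small sketch. The abstract does not state which intermediate scales the authors use, so the factor of 2 per stage below is an assumption, and the function names `stage_factors` and `resolution_schedule` are hypothetical helpers introduced only for this illustration. The sketch shows only the bookkeeping of intermediate resolutions, not the network itself.

```python
def stage_factors(total_scale, step=2):
    """Split a total upscaling factor into a list of per-stage factors.

    Assumption (not from the paper): every stage upsamples by `step`,
    so `total_scale` must be a power of `step`.
    """
    if total_scale < 1 or step < 2:
        raise ValueError("total_scale must be >= 1 and step must be >= 2")
    factors = []
    remaining = total_scale
    while remaining > 1:
        if remaining % step != 0:
            raise ValueError("total_scale is not a power of step")
        factors.append(step)
        remaining //= step
    return factors


def resolution_schedule(height, width, total_scale, step=2):
    """Return the spatial size after each stage, starting from the input size."""
    sizes = [(height, width)]
    for factor in stage_factors(total_scale, step):
        h, w = sizes[-1]
        sizes.append((h * factor, w * factor))
    return sizes


if __name__ == "__main__":
    # An 8x task on a 32x48 sub-aperture image, split into three 2x stages.
    print(stage_factors(8))
    print(resolution_schedule(32, 48, 8))
```

Under this assumption, each stage only has to learn a 2x mapping, and the intermediate sizes give natural points at which a training loss could be applied.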
About the journal:
The Society’s Field of Interest is “Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.” In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that “broadcast systems includes all aspects of transmission, propagation, and reception.”