Polarization video frame interpolation for 3D human pose reconstruction with attention mechanism

IF 3.5 2区 工程技术 Q2 OPTICS
Xin Zhang , Xuesong Wang , Yixuan Xu, Xianyu Wu, Feng Huang
{"title":"Polarization video frame interpolation for 3D human pose reconstruction with attention mechanism","authors":"Xin Zhang ,&nbsp;Xuesong Wang ,&nbsp;Yixuan Xu,&nbsp;Xianyu Wu,&nbsp;Feng Huang","doi":"10.1016/j.optlaseng.2025.109046","DOIUrl":null,"url":null,"abstract":"<div><div>Video frame interpolation has been extensively explored and demonstrated, yet its application to polarization remains largely unexplored. Due to the selective transmission of light by polarized filters, longer exposure times are typically required to ensure sufficient light intensity, which consequently lower the temporal sample rates. Furthermore, because polarization reflected by objects varies with shooting perspective, focusing solely on estimating pixel displacement is insufficient to accurately reconstruct the intermediate polarization. To tackle these challenges, this study proposes a deep learning method called Swin-VFI based on the Swin-Transformer and introduces a tailored loss function to facilitate the network's understanding of polarization changes. To ensure the practicality of our proposed method, this study evaluates its interpolated frames in Shape from Polarization and Human Shape Reconstruction tasks, comparing them with other state-of-the-art methods such as CAIN, FLAVR, and VFIT. The experimental results show that our method achieves an average PSNR improvement of 0.43 dB over the suboptimal methods in all tasks.</div></div>","PeriodicalId":49719,"journal":{"name":"Optics and Lasers in Engineering","volume":"193 ","pages":"Article 109046"},"PeriodicalIF":3.5000,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Optics and Lasers in Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0143816625002325","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"OPTICS","Score":null,"Total":0}
引用次数: 0

Abstract

Video frame interpolation has been extensively explored and demonstrated, yet its application to polarization remains largely unexplored. Due to the selective transmission of light by polarized filters, longer exposure times are typically required to ensure sufficient light intensity, which consequently lower the temporal sample rates. Furthermore, because polarization reflected by objects varies with shooting perspective, focusing solely on estimating pixel displacement is insufficient to accurately reconstruct the intermediate polarization. To tackle these challenges, this study proposes a deep learning method called Swin-VFI based on the Swin-Transformer and introduces a tailored loss function to facilitate the network's understanding of polarization changes. To ensure the practicality of our proposed method, this study evaluates its interpolated frames in Shape from Polarization and Human Shape Reconstruction tasks, comparing them with other state-of-the-art methods such as CAIN, FLAVR, and VFIT. The experimental results show that our method achieves an average PSNR improvement of 0.43 dB over the suboptimal methods in all tasks.
基于注意机制的三维人体姿态重建偏振视频帧插值
视频帧内插已经被广泛的探索和证明,但其在偏振中的应用仍然很大程度上未被探索。由于光通过偏振滤光片的选择性传输,通常需要较长的曝光时间来确保足够的光强度,从而降低时间采样率。此外,由于物体反射的偏振随拍摄视角的不同而变化,单纯地估计像元位移不足以准确地重建中间偏振。为了应对这些挑战,本研究提出了一种基于swwin - transformer的深度学习方法swwin - vfi,并引入了定制的损失函数,以促进网络对极化变化的理解。为了确保我们提出的方法的实用性,本研究评估了偏振和人体形状重建任务的形状插值帧,并将其与其他最先进的方法(如CAIN, FLAVR和VFIT)进行了比较。实验结果表明,在所有任务中,我们的方法比次优方法的平均PSNR提高了0.43 dB。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Optics and Lasers in Engineering
Optics and Lasers in Engineering 工程技术-光学
CiteScore
8.90
自引率
8.70%
发文量
384
审稿时长
42 days
期刊介绍: Optics and Lasers in Engineering aims at providing an international forum for the interchange of information on the development of optical techniques and laser technology in engineering. Emphasis is placed on contributions targeted at the practical use of methods and devices, the development and enhancement of solutions and new theoretical concepts for experimental methods. Optics and Lasers in Engineering reflects the main areas in which optical methods are being used and developed for an engineering environment. Manuscripts should offer clear evidence of novelty and significance. Papers focusing on parameter optimization or computational issues are not suitable. Similarly, papers focussed on an application rather than the optical method fall outside the journal''s scope. The scope of the journal is defined to include the following: -Optical Metrology- Optical Methods for 3D visualization and virtual engineering- Optical Techniques for Microsystems- Imaging, Microscopy and Adaptive Optics- Computational Imaging- Laser methods in manufacturing- Integrated optical and photonic sensors- Optics and Photonics in Life Science- Hyperspectral and spectroscopic methods- Infrared and Terahertz techniques
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信