Scalable High-Fidelity 3D Hand Shape Reconstruction via Graph-Image Frequency Mapping and Graph Frequency Decomposition

Impact factor (IF): 18.6

IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2025-03-25. DOI: 10.1109/TPAMI.2025.3554516

Tianyu Luan; Yuanhao Zhai; Jingjing Meng; Zhong Li; Zhang Chen; Yi Xu; Junsong Yuan

Volume 47, Issue 7, pages 5832-5846. Publication type: Journal Article. Open access: no. Full record: https://ieeexplore.ieee.org/document/10938718/

Citations: 0

Abstract


Scalable High-Fidelity 3D Hand Shape Reconstruction via Graph-Image Frequency Mapping and Graph Frequency Decomposition

Despite the impressive performance of recent single-image hand modeling techniques, they lack the capability to capture sufficient detail of the 3D hand mesh. This deficiency greatly limits their application when high-fidelity hand modeling is required, e.g., personalized hand modeling. To address this problem, we design a frequency split network that generates 3D hand meshes using different frequency bands in a coarse-to-fine manner. To capture high-frequency personalized details, we transform the 3D mesh into the frequency domain and propose a novel frequency decomposition loss to supervise each frequency component. With this coarse-to-fine scheme, hand details that correspond to the higher frequency bands can be preserved. In addition, the proposed network is scalable: inference can stop at any resolution level to accommodate hardware with varying computational power. To feed the scalable frequency network with frequency-split image features, we propose an image-graph ring feature mapping strategy. To train our network with per-vertex supervision, we use a bidirectional registration strategy to generate topology-fixed ground truth. To quantitatively evaluate how well our method recovers personalized shape details, we introduce a new evaluation metric, Mean-frequency Signal-to-Noise Ratio (MSNR), which measures the mean signal-to-noise ratio of the mesh signal across frequency components. Extensive experiments demonstrate that our approach generates fine-grained details for high-fidelity 3D hand reconstruction, and that our evaluation metric is more effective than traditional metrics for measuring mesh details.
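The abstract does not give formulas for the frequency transform or for MSNR, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the mesh "frequency domain" is the eigenbasis of a combinatorial graph Laplacian built from the mesh edges (a common choice in graph signal processing, not confirmed by the abstract), and it uses a simple per-frequency signal-to-noise ratio averaged over frequencies as a stand-in for MSNR. The function names `graph_laplacian`, `mesh_spectrum`, `low_pass`, and `mean_frequency_snr_db` are hypothetical and are not taken from the paper.

```python
import numpy as np


def graph_laplacian(num_vertices, faces):
    """Combinatorial Laplacian L = D - A of the mesh graph (assumed choice)."""
    adjacency = np.zeros((num_vertices, num_vertices))
    for i, j, k in faces:
        for a, b in ((i, j), (j, k), (k, i)):
            adjacency[a, b] = adjacency[b, a] = 1.0
    return np.diag(adjacency.sum(axis=1)) - adjacency


def mesh_spectrum(vertices, basis):
    """Project (N, 3) vertex coordinates onto the Laplacian eigenbasis."""
    return basis.T @ vertices


def low_pass(vertices, basis, k):
    """Rebuild the mesh from its k lowest-frequency components only."""
    return basis[:, :k] @ (basis[:, :k].T @ vertices)


def mean_frequency_snr_db(pred, gt, basis, eps=1e-12):
    """Illustrative stand-in for MSNR: per-frequency SNR in dB, then the mean.

    This is NOT the paper's definition, which the abstract does not state.
    """
    signal = (mesh_spectrum(gt, basis) ** 2).sum(axis=1)
    noise = (mesh_spectrum(pred - gt, basis) ** 2).sum(axis=1)
    return float(np.mean(10.0 * np.log10((signal + eps) / (noise + eps))))


# Toy example: a tetrahedron standing in for a hand mesh.
vertices = np.array(
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
)
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

laplacian = graph_laplacian(len(vertices), faces)
# eigh returns eigenvalues in ascending order, so columns go low to high frequency.
eigvals, basis = np.linalg.eigh(laplacian)

coarse = low_pass(vertices, basis, k=1)            # lowest frequency only
full = low_pass(vertices, basis, k=len(vertices))  # all frequencies

# A fixed perturbation scaled to two noise levels.
perturbation = np.random.default_rng(0).normal(size=vertices.shape)
snr_small_noise = mean_frequency_snr_db(vertices + 0.01 * perturbation, vertices, basis)
snr_large_noise = mean_frequency_snr_db(vertices + 0.10 * perturbation, vertices, basis)
```

In this sketch, keeping only the first `k` eigenvectors yields a coarse shape, and adding higher-frequency components restores detail, which mirrors the coarse-to-fine idea described above. Averaging the ratio per frequency, rather than over the whole mesh, keeps high-frequency errors from being swamped by the much larger low-frequency energy.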


Source journal

IEEE Transactions on Pattern Analysis and Machine Intelligence

Self-citation rate

0.00%

Number of publications