Advances in vision-based deep learning methods for interacting hands reconstruction: A survey

IF 2.8, Zone 4 (Computer Science), JCR Q2: COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk Pub Date : 2024-10-05 DOI:10.1016/j.cag.2024.104102

Yu Miao, Yue Liu

Computers & Graphics-Uk, vol. 124, Article 104102. https://www.sciencedirect.com/science/article/pii/S0097849324002371

Citations: 0

Abstract

Vision-based hand reconstruction has become a noteworthy tool for enhancing interactive experiences in applications such as virtual reality, augmented reality, and autonomous driving, enabling sophisticated interactions by reconstructing the complex motions of human hands. Despite significant progress driven by deep learning, high-fidelity reconstruction of interacting hands still faces challenges such as limited dataset diversity, a lack of detailed hand representations, occlusions, and the difficulty of differentiating between similar hand structures. This survey reviews deep learning-based methods, datasets, loss functions, and evaluation metrics that address the complexities of interacting hands reconstruction. Mainstream algorithms of the past five years are systematically classified into two main categories: those that employ explicit representations, such as parametric meshes and 3D Gaussian splatting, and those that use implicit representations, including signed distance fields and neural radiance fields. Newer deep-learning models such as graph convolutional networks and transformers have been applied to address these challenges. Beyond summarizing these interaction-aware algorithms, the survey also briefly discusses hand tracking in virtual reality and augmented reality. To the best of our knowledge, this is the first survey specifically focusing on the reconstruction of both hands and their interactions with objects. The survey covers hand modeling, deep learning approaches, and datasets, with the aim of broadening the horizon of hand reconstruction research and future innovation in natural user interaction.


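Source journal

Computers & Graphics-Uk (Engineering & Technology - Computer Science: Software Engineering)

CiteScore: 5.30

Self-citation rate: 12.00%

Articles published: 173

Review time: 38 days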

Journal description: Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on: 1. Research and applications of interactive computer graphics. We are particularly interested in novel interaction techniques and applications of CG to problem domains. 2. State-of-the-art papers on late-breaking, cutting-edge research on CG. 3. Information on innovative uses of graphics principles and technologies. 4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.