FingerPoseNet: A finger-level multitask learning network with residual feature sharing for 3D hand pose estimation

IF 6 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Networks Pub Date : 2025-03-10 DOI:10.1016/j.neunet.2025.107315

Tekie Tsegay Tewolde , Ali Asghar Manjotho , Prodip Kumar Sarker , Zhendong Niu

{"title":"FingerPoseNet: A finger-level multitask learning network with residual feature sharing for 3D hand pose estimation","authors":"Tekie Tsegay Tewolde , Ali Asghar Manjotho , Prodip Kumar Sarker , Zhendong Niu","doi":"10.1016/j.neunet.2025.107315","DOIUrl":null,"url":null,"abstract":"<div><div>Hand pose estimation approaches commonly rely on shared hand feature maps to regress the 3D locations of all hand joints. Subsequently, they struggle to enhance finger-level features which are invaluable in capturing joint-to-finger associations and articulations. To address this limitation, we propose a finger-level multitask learning network with residual feature sharing, named FingerPoseNet, for accurate 3D hand pose estimation from a depth image. FingerPoseNet comprises three stages: (a) a shared base feature map extraction backbone based on pre-trained ResNet-50; (b) a finger-level multitask learning stage that extracts and enhances feature maps for each finger and the palm; and (c) a multitask fusion layer for consolidating the estimation results obtained by each subtask. We exploit multitask learning by decoupling the hand pose estimation task into six subtasks dedicated to each finger and palm. Each subtask is responsible for subtask-specific feature extraction, enhancement, and 3D keypoint regression. To enhance subtask-specific features, we propose a residual feature-sharing approach scaled up to mine supplementary information from all subtasks. Experiments performed on five challenging public hand pose datasets, including ICVL, NYU, MSRA, Hands-2019-Task1, and HO3D-v3 demonstrate significant improvements in accuracy compared with state-of-the-art approaches.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"187 ","pages":"Article 107315"},"PeriodicalIF":6.0000,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025001947","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Hand pose estimation approaches commonly rely on shared hand feature maps to regress the 3D locations of all hand joints. Subsequently, they struggle to enhance finger-level features which are invaluable in capturing joint-to-finger associations and articulations. To address this limitation, we propose a finger-level multitask learning network with residual feature sharing, named FingerPoseNet, for accurate 3D hand pose estimation from a depth image. FingerPoseNet comprises three stages: (a) a shared base feature map extraction backbone based on pre-trained ResNet-50; (b) a finger-level multitask learning stage that extracts and enhances feature maps for each finger and the palm; and (c) a multitask fusion layer for consolidating the estimation results obtained by each subtask. We exploit multitask learning by decoupling the hand pose estimation task into six subtasks dedicated to each finger and palm. Each subtask is responsible for subtask-specific feature extraction, enhancement, and 3D keypoint regression. To enhance subtask-specific features, we propose a residual feature-sharing approach scaled up to mine supplementary information from all subtasks. Experiments performed on five challenging public hand pose datasets, including ICVL, NYU, MSRA, Hands-2019-Task1, and HO3D-v3 demonstrate significant improvements in accuracy compared with state-of-the-art approaches.

查看原文本刊更多论文

求助全文

约1分钟内获得全文求助全文

来源期刊

Neural Networks 工程技术-计算机：人工智能

CiteScore

13.90

自引率

7.70%

发文量

425

审稿时长

67 days

期刊介绍： Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.