Attention-based hand pose estimation with voting and dual modalities
Authors: Dinh-Cuong Hoang, Anh-Nhat Nguyen, Thu-Uyen Nguyen, Ngoc-Anh Hoang, Van-Duc Vu, Duy-Quang Vu, Phuc-Quan Ngo, Khanh-Toan Phan, Duc-Thanh Tran, Van-Thiep Nguyen, Quang-Tri Duong, Ngoc-Trung Ho, Cong-Trinh Tran, Van-Hiep Duong, Anh-Truong Mai
Journal: Engineering Applications of Artificial Intelligence, volume 139, Article 109526 (impact factor 7.5; JCR Q1, Automation & Control Systems)
Published: 2024-11-08
Publication type: Journal Article
DOI: 10.1016/j.engappai.2024.109526
URL: https://www.sciencedirect.com/science/article/pii/S0952197624016841
Hand pose estimation has recently emerged as a compelling topic in the robotic research community, because of its usefulness in learning from human demonstration or safe human–robot interaction. Although deep learning-based methods have been introduced for this task and have shown promise, it remains a challenging problem. To address this, we propose a novel end-to-end architecture for hand pose estimation using red-green-blue (RGB) and depth (D) data (RGB-D). Our approach processes the two data sources separately and utilizes a dense fusion network with an attention module to extract discriminative features. The features extracted include both spatial information and geometric constraints, which are fused to vote for the hand pose. We demonstrate that our voting mechanism in conjunction with the attention mechanism is particularly useful for solving the problem, especially when hands are heavily occluded by objects or are self-occluded. Our experimental results on benchmark datasets demonstrate that our approach outperforms state-of-the-art methods by a significant margin.
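The abstract describes fused features that "vote" for the hand pose, with an attention mechanism helping under occlusion. The following is only a minimal sketch of that general idea, not the authors' network: it assumes each observed 3D point predicts an offset to a joint and carries an attention score, and it combines the resulting votes with softmax weights. The function names (`softmax`, `vote_for_joint`) and the input layout are hypothetical and chosen purely for illustration.

```python
import math


def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def vote_for_joint(points, offsets, scores):
    """Aggregate per-point votes for a single joint position.

    points  -- list of (x, y, z) observed points (assumed input)
    offsets -- list of (dx, dy, dz) predicted offsets from each point to the joint
    scores  -- list of raw attention scores, one per point
    """
    weights = softmax(scores)
    # Each point casts a vote: its own position plus its predicted offset.
    votes = [tuple(p[i] + o[i] for i in range(3)) for p, o in zip(points, offsets)]
    # The estimate is the attention-weighted average of all votes.
    return tuple(sum(w * v[i] for w, v in zip(weights, votes)) for i in range(3))


if __name__ == "__main__":
    points = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
    offsets = [(1.0, 1.0, 1.0), (-1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]
    # The third point stands in for an occluded region and gets a very low score.
    scores = [0.0, 0.0, -1000.0]
    print(vote_for_joint(points, offsets, scores))
```

In this toy setup, a point with a very low attention score contributes almost nothing to the estimate, which is one plausible reading of why attention and voting could complement each other when parts of the hand are occluded.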
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.