An Attention Based Network for Two-dimensional Hand Pose Estimation

Yujie Fang, Junfan Wang, Yi Chen, Mingyu Gao, Hongtao Zhou, Yaonong Wang

Proceedings of the 6th International Conference on Advances in Image Processing, 2022. DOI: 10.1145/3577117.3577133
Abstract
Accurate visual hand pose estimation at the joint level is used in vision-based human-computer interaction (HCI) applications across a number of areas. However, current 2D hand pose estimators tend to optimize for either high accuracy or high speed, so detectors rarely achieve pose estimation that is both fast and accurate. In this paper, we combine RepVGG with a self-attention mechanism and propose an improved network called ARepNet. ARepNet doubles the speed of the network model through structural re-parameterization and captures long-range dependencies by connecting information from different spatial locations, achieving an accuracy of 86.8%. We also contribute a 2D hand pose dataset captured in low-light conditions and propose a simple contrast enhancement method that makes 2D hand pose estimation robust to image inputs from different environments. We have successfully deployed ARepNet on embedded devices, where it runs at 139 frames per second (FPS), meeting real-time requirements.
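The claimed speedup comes from RepVGG-style structural re-parameterization: multi-branch blocks used during training (3x3 conv, 1x1 conv, identity, each with batch normalization) are algebraically fused into a single 3x3 convolution for inference. The abstract does not include the authors' code, so the sketch below only illustrates the core conv+BatchNorm fusion step that this technique relies on; the function name and shapes are hypothetical.

```python
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold a BatchNorm layer into the preceding convolution.

    y = gamma * (conv(x) - mean) / sqrt(var + eps) + beta
    is equivalent to a single convolution with rescaled weights,
    which is the basic identity behind RepVGG re-parameterization.
    """
    fused = nn.Conv2d(conv.in_channels, conv.out_channels,
                      conv.kernel_size, conv.stride,
                      conv.padding, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.data = bn.bias + (conv_bias - bn.running_mean) * scale
    return fused

# Usage: after training, each conv+BN pair is replaced in place; the
# fused model produces numerically identical outputs with fewer layers.
conv = nn.Conv2d(16, 32, 3, padding=1, bias=False)
bn = nn.BatchNorm2d(32)
bn.eval()  # fusion assumes frozen running statistics
x = torch.randn(1, 16, 8, 8)
with torch.no_grad():
    ref = bn(conv(x))
    out = fuse_conv_bn(conv, bn)(x)
print(torch.allclose(ref, out, atol=1e-5))  # True
```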
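The abstract describes the self-attention mechanism only as "connecting information from different places" to capture long-range dependencies. A common way to do this in a CNN backbone is a non-local-style spatial self-attention block; the sketch below is a generic example of that pattern under this assumption, not the paper's exact module, and the reduction ratio r is a hypothetical hyperparameter.

```python
import torch
import torch.nn as nn

class SpatialSelfAttention(nn.Module):
    """Self-attention over all spatial positions of a feature map, so
    every location can aggregate evidence from every other location."""

    def __init__(self, channels: int, r: int = 8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // r, 1)
        self.key = nn.Conv2d(channels, channels // r, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (n, hw, c/r)
        k = self.key(x).flatten(2)                    # (n, c/r, hw)
        v = self.value(x).flatten(2)                  # (n, c, hw)
        # Scaled dot-product attention over all position pairs.
        attn = torch.softmax(q @ k / (k.shape[1] ** 0.5), dim=-1)  # (n, hw, hw)
        out = (v @ attn.transpose(1, 2)).reshape(n, c, h, w)
        return x + self.gamma * out  # residual connection
```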
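The "simple contrast enhancement method" for low-light inputs is not specified in the abstract. One widely used, lightweight option consistent with that description is CLAHE applied to the lightness channel; the snippet below shows this as an assumed stand-in for the authors' method, using standard OpenCV calls.

```python
import cv2
import numpy as np

def enhance_contrast(bgr: np.ndarray) -> np.ndarray:
    """Boost local contrast of a low-light BGR image via CLAHE on the
    lightness channel, leaving chrominance untouched."""
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    l = clahe.apply(l)
    return cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR)
```

Operating on the LAB lightness channel rather than on RGB directly avoids shifting colors while still recovering hand contours in dark frames, which is what matters for keypoint localization.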