Revising Representation and Target Deviations for Accurate Human Pose Estimation
Authors: Zian Zhang, Yongqiang Zhang, Yancheng Bai, Man Zhang, Rui Tian, Yin Zhang, Mingli Ding, Wangmeng Zuo
Journal: IEEE Transactions on Neural Networks and Learning Systems (Impact Factor 10.2; JCR Q1, Computer Science, Artificial Intelligence)
Publication type: Journal Article
Published: 2025-05-22
DOI: 10.1109/tnnls.2025.3569464 (https://doi.org/10.1109/tnnls.2025.3569464)
Citations: 0
Abstract
Thanks to normalized instance scales and robust supervision, heatmap-based human pose estimation (HPE) methods that follow the top-down paradigm achieve dominant performance. However, the basic framework has two inherent deviations, the representation deviation and the target deviation, which create performance bottlenecks. The representation deviation arises from resizing instances of various scales to a unified input size; performance degrades because data with different scale-related characteristics can hardly be handled by a single set of parameters. The target deviation arises from using a prior distribution (e.g., Gaussian) to model the prediction error, which hinders sufficient network training. In this article, we propose a novel framework called DRPose to revise these deviations. Specifically, to address the representation deviation, a scale-aware domain bridging (SDB) block is proposed to transfer feature maps from multiple scale-dependent domains into a unified intermediate domain with dynamic parameters. To address the target deviation, a differentiable coordinate decoder (DCD) is presented to adaptively adjust the target distribution of heatmaps in an end-to-end manner. Extensive experiments show that the proposed method significantly improves the performance of most existing models at negligible additional cost. Beyond this, our method achieves 77.1% AP on the COCO test-dev set, outperforming prior works with similar model complexity.
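The abstract does not describe the internals of the DCD. As a generic illustration of what decoding coordinates from a heatmap in a differentiable way can look like, the sketch below uses soft-argmax, a standard technique that is swapped in here and is not claimed to be the paper's decoder. The function names, the `beta` temperature, and the heatmap size are all assumptions made for the example.

```python
import numpy as np


def soft_argmax_2d(heatmap, beta=1.0):
    """Expected (x, y) location under a softmax over the heatmap.

    heatmap: array of shape (H, W) holding raw keypoint scores.
    beta: hypothetical temperature; larger values approach a hard argmax.
    Every step is smooth, so gradients could flow back to the heatmap
    if the same operations were written in an autodiff framework.
    """
    h, w = heatmap.shape
    logits = beta * heatmap.reshape(-1)
    logits = logits - logits.max()  # stabilize the exponentials
    probs = np.exp(logits)
    probs = (probs / probs.sum()).reshape(h, w)
    # Marginalize over rows/columns, then take the expected index.
    x = float((probs.sum(axis=0) * np.arange(w)).sum())
    y = float((probs.sum(axis=1) * np.arange(h)).sum())
    return x, y


def gaussian_heatmap(h, w, cx, cy, sigma=2.0):
    """Gaussian target heatmap centered at (cx, cy), the usual fixed prior."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))


# Usage: decode the keypoint location from a synthetic 64x48 heatmap.
hm = gaussian_heatmap(64, 48, cx=20, cy=30)
x, y = soft_argmax_2d(hm, beta=50.0)
```

A hard argmax has no useful gradient, which is why a fixed Gaussian target is normally supervised on the heatmap itself. A decoder of this smooth kind lets a loss on the coordinates shape the heatmap directly, which is the general idea behind adjusting the target distribution end to end.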
About the journal:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.