Revising Representation and Target Deviations for Accurate Human Pose Estimation.

IF 10.2 | Zone 1 (Computer Science) | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Zian Zhang, Yongqiang Zhang, Yancheng Bai, Man Zhang, Rui Tian, Yin Zhang, Mingli Ding, Wangmeng Zuo
DOI: 10.1109/tnnls.2025.3569464 (https://doi.org/10.1109/tnnls.2025.3569464)
Journal: IEEE Transactions on Neural Networks and Learning Systems
Publication date: 2025-05-22
Publication type: Journal Article
Citation count: 0

Abstract

Owing to normalized instance scales and robust supervision, heatmap-based human pose estimation (HPE) methods that follow the top-down paradigm have achieved dominant performance. However, the basic framework has two inherent deviations, namely representation and target deviations, which create performance bottlenecks. The representation deviation arises from resizing instances of various scales to a unified input size; performance degrades because data with different scale-related characteristics can hardly be handled by a single set of parameters. The target deviation arises from using a prior distribution (e.g., a Gaussian) to model the prediction error, which hinders sufficient network training. In this article, we propose a novel framework called DRPose to revise these deviations. Specifically, to address the representation deviation, a scale-aware domain bridging (SDB) block is proposed to transfer feature maps from multiple scale-dependent domains into a unified intermediate domain with dynamic parameters. To address the target deviation, a differentiable coordinate decoder (DCD) is presented to adaptively adjust the target distribution of heatmaps in an end-to-end manner. Extensive experiments show that the proposed method significantly improves the performance of most existing models at negligible additional cost. Beyond this, our method achieves 77.1% AP on the COCO test-dev set, outperforming prior works of similar model complexity.
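
The abstract does not describe how the DCD is built, so the following is only a rough illustration of the general setting it refers to: a fixed Gaussian target heatmap, the usual non-differentiable argmax decoding, and a differentiable alternative. The differentiable decoder here is the standard expectation-based (soft-argmax style) decoding, used as a stand-in; it is not the paper's DCD. All function names, the heatmap size, and the value of sigma are assumptions made for this sketch.

```python
import math


def gaussian_heatmap(width, height, cx, cy, sigma=2.0):
    """Render a fixed Gaussian target heatmap centred on (cx, cy).

    This is the kind of hand-chosen prior target the abstract refers to;
    sigma=2.0 is an arbitrary illustrative value.
    """
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def argmax_decode(heatmap):
    """Non-differentiable decoding: integer (x, y) location of the peak."""
    _, x, y = max(
        ((v, x, y) for y, row in enumerate(heatmap) for x, v in enumerate(row)),
        key=lambda item: item[0],
    )
    return float(x), float(y)


def expectation_decode(heatmap):
    """Differentiable decoding: heatmap-weighted mean of pixel coordinates.

    Assumes a non-negative heatmap with a positive sum. This is a generic
    stand-in, not the DCD proposed in the paper.
    """
    total = sum(sum(row) for row in heatmap)
    x = sum(v * xi for row in heatmap for xi, v in enumerate(row)) / total
    y = sum(v * yi for yi, row in enumerate(heatmap) for v in row) / total
    return x, y


if __name__ == "__main__":
    # Hypothetical keypoint at a sub-pixel location on a 24x16 heatmap.
    hm = gaussian_heatmap(24, 16, cx=10.3, cy=7.6)
    print("argmax decode:     ", argmax_decode(hm))       # snaps to the pixel grid
    print("expectation decode:", expectation_decode(hm))  # close to (10.3, 7.6)
```

Because the expectation is a smooth function of every heatmap value, a coordinate loss can be backpropagated through the decoder into the heatmap, which is one generic way to let training shape the heatmap distribution instead of fixing it in advance.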
Source journal
IEEE Transactions on Neural Networks and Learning Systems
Categories: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
CiteScore: 23.80
Self-citation rate: 9.60%
Articles published: 2102
Review time: 3-8 weeks
Journal description: The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.