Muti-scale feature refined network for human pose estimation

IF 1.1 4区 计算机科学 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Qiaoning Yang, Xiaodong Ji, Xiuhui Yang
{"title":"Muti-scale feature refined network for human pose estimation","authors":"Qiaoning Yang, Xiaodong Ji, Xiuhui Yang","doi":"10.1142/s0218001423560220","DOIUrl":null,"url":null,"abstract":"Occlusive keypoints has been a challenge for human pose estimation, especially the mutual occlusion of human bodies. One possible solution to this problem is to utilize multi-scale features, where small scale features are capable of identifying keypoints, while large-scale features can capture the relationship between keypoints. Feature fusion among multi-scale features allows for the exchange of information between keypoints, facilitating the inference of occluded keypoints based on the identified keypoints. However, it’s found that there are invalid features in feature fusion which will interfere valid feature. In this paper, we propose multi-scale feature refined network (MSFRNet) based on HRNet and a new attention module namely multi-resolution attention module (MRAM). The proposed MRAM is designed to strengthen the effective information while suppressing redundant information. It has multiple inputs and outputs and can learn the relationships between keypoints while retaining detailed information. The proposed MSFRNet outperforms HRNet, achieving a 1.4[Formula: see text]AP improvement on the COCO dataset with only a marginal computational increase of 0.35 GFLOPs. Additionally, it demonstrates superior performance with a 0.9[Formula: see text]AP, 0.7[Formula: see text]AP, and 1.8[Formula: see text]AP improvement on the MPII, CrowdPose and OCHuman datasets, respectively. Furthermore, compared with the latest attention mechanism PSA, the MSFRNet exhibits lower computational cost while maintaining the same pose-estimation accuracy.","PeriodicalId":54949,"journal":{"name":"International Journal of Pattern Recognition and Artificial Intelligence","volume":" 28","pages":"0"},"PeriodicalIF":1.1000,"publicationDate":"2023-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Pattern Recognition and Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/s0218001423560220","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Occlusive keypoints has been a challenge for human pose estimation, especially the mutual occlusion of human bodies. One possible solution to this problem is to utilize multi-scale features, where small scale features are capable of identifying keypoints, while large-scale features can capture the relationship between keypoints. Feature fusion among multi-scale features allows for the exchange of information between keypoints, facilitating the inference of occluded keypoints based on the identified keypoints. However, it’s found that there are invalid features in feature fusion which will interfere valid feature. In this paper, we propose multi-scale feature refined network (MSFRNet) based on HRNet and a new attention module namely multi-resolution attention module (MRAM). The proposed MRAM is designed to strengthen the effective information while suppressing redundant information. It has multiple inputs and outputs and can learn the relationships between keypoints while retaining detailed information. The proposed MSFRNet outperforms HRNet, achieving a 1.4[Formula: see text]AP improvement on the COCO dataset with only a marginal computational increase of 0.35 GFLOPs. Additionally, it demonstrates superior performance with a 0.9[Formula: see text]AP, 0.7[Formula: see text]AP, and 1.8[Formula: see text]AP improvement on the MPII, CrowdPose and OCHuman datasets, respectively. Furthermore, compared with the latest attention mechanism PSA, the MSFRNet exhibits lower computational cost while maintaining the same pose-estimation accuracy.
人体姿态估计的多尺度特征细化网络
关键点遮挡一直是人体姿态估计的难题,尤其是人体的相互遮挡。一种可能的解决方案是利用多尺度特征,其中小规模特征能够识别关键点,而大规模特征可以捕捉关键点之间的关系。多尺度特征之间的特征融合允许关键点之间的信息交换,便于基于识别的关键点推断被遮挡的关键点。然而,在特征融合过程中发现存在无效特征,会干扰有效特征。本文提出了基于HRNet的多尺度特征细化网络(MSFRNet)和一种新的关注模块——多分辨率关注模块(MRAM)。该算法旨在增强有效信息,同时抑制冗余信息。它具有多个输入和输出,可以在保留详细信息的同时学习关键点之间的关系。提出的MSFRNet优于HRNet,在COCO数据集上实现了1.4[公式:见文本]AP的改进,而计算量仅增加了0.35 GFLOPs。此外,在MPII、CrowdPose和ochhuman数据集上,它的AP分别提高了0.9[公式:见文]、0.7[公式:见文]和1.8[公式:见文],表现出了卓越的性能。此外,与最新的注意机制PSA相比,MSFRNet在保持相同姿态估计精度的情况下具有更低的计算成本。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
2.90
自引率
13.30%
发文量
201
审稿时长
15.8 months
期刊介绍: The International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI) welcomes both theory-oriented and innovative applications articles on new developments and is of interest to both researchers in academia and industry. The current scope of this journal includes: • Pattern Recognition • Machine Learning • Deep Learning • Document Analysis • Image Processing • Signal Processing • Computer Vision • Biometrics • Biomedical Image Analysis • Artificial Intelligence In addition to regular papers describing original research work, survey articles on timely and important research topics are highly welcome. Special issues with focused topics within the scope of this journal are also published.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信