Muti-scale feature refined network for human pose estimation

IF 1.1 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Pattern Recognition and Artificial Intelligence Pub Date : 2023-11-09 DOI:10.1142/s0218001423560220

Qiaoning Yang, Xiaodong Ji, Xiuhui Yang


Citations: 0

Abstract

Occluded keypoints have long been a challenge for human pose estimation, especially under mutual occlusion between human bodies. One possible solution is to exploit multi-scale features: small-scale features can identify keypoints, while large-scale features can capture the relationships between keypoints. Fusing multi-scale features allows information to be exchanged between keypoints, so that occluded keypoints can be inferred from the keypoints already identified. However, we find that feature fusion also carries invalid features that interfere with the valid ones. In this paper, we propose a multi-scale feature refined network (MSFRNet), based on HRNet, together with a new attention module, the multi-resolution attention module (MRAM). MRAM is designed to strengthen effective information while suppressing redundant information. It has multiple inputs and outputs and can learn the relationships between keypoints while retaining detailed information. MSFRNet outperforms HRNet, improving AP by 1.4 on the COCO dataset with only a marginal computational increase of 0.35 GFLOPs. It also improves AP by 0.9, 0.7, and 1.8 on the MPII, CrowdPose, and OCHuman datasets, respectively. Furthermore, compared with the latest attention mechanism, PSA, MSFRNet has a lower computational cost while maintaining the same pose-estimation accuracy.
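The abstract does not describe the internals of MRAM, so the following is only a minimal sketch of the general idea it outlines: features at two resolutions exchange information, and each fused output is then rescaled channel by channel so that some channels are emphasized and others damped. The function names, the two-branch setup, and the squeeze-and-excitation style gate are all assumptions made for illustration and are not the authors' module.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def downsample_mean(x, factor):
    """Average-pool a (C, H, W) map by an integer factor."""
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def channel_gate(x, weight, bias):
    """Hypothetical gate: per-channel weights in (0, 1) from pooled features."""
    squeezed = x.mean(axis=(1, 2))         # global average pool, shape (C,)
    logits = weight @ squeezed + bias      # one linear layer, shape (C,)
    gate = 1.0 / (1.0 + np.exp(-logits))   # sigmoid
    return x * gate[:, None, None], gate

def fuse_and_refine(high, low, weight, bias):
    """Exchange information between two resolutions, then gate each output."""
    factor = high.shape[1] // low.shape[1]
    fused_high = high + upsample_nearest(low, factor)  # low-res context into high-res
    fused_low = low + downsample_mean(high, factor)    # high-res detail into low-res
    refined_high, gate_high = channel_gate(fused_high, weight, bias)
    refined_low, gate_low = channel_gate(fused_low, weight, bias)
    return refined_high, refined_low, gate_high, gate_low

# Toy inputs: 4 channels, an 8x8 high-resolution map and a 4x4 low-resolution map.
rng = np.random.default_rng(0)
high = rng.standard_normal((4, 8, 8))
low = rng.standard_normal((4, 4, 4))
weight = rng.standard_normal((4, 4)) * 0.1  # untrained, illustrative parameters
bias = np.zeros(4)

refined_high, refined_low, gate_high, gate_low = fuse_and_refine(high, low, weight, bias)
print(refined_high.shape, refined_low.shape)  # (4, 8, 8) (4, 4, 4)
```

The sketch takes two inputs and returns two outputs, mirroring the abstract's remark that the module has multiple inputs and outputs. In a trained network the gate parameters would be learned; here they are random and serve only to show the data flow.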


Source journal

International Journal of Pattern Recognition and Artificial Intelligence (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore

2.90

Self-citation rate

13.30%

Articles published

201

Review time

15.8 months

About the journal: The International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI) welcomes both theory-oriented and innovative applications articles on new developments and is of interest to researchers in both academia and industry. The current scope of this journal includes:

• Pattern Recognition
• Machine Learning
• Deep Learning
• Document Analysis
• Image Processing
• Signal Processing
• Computer Vision
• Biometrics
• Biomedical Image Analysis
• Artificial Intelligence

In addition to regular papers describing original research work, survey articles on timely and important research topics are highly welcome. Special issues with focused topics within the scope of this journal are also published.