Muti-scale feature refined network for human pose estimation
Authors: Qiaoning Yang, Xiaodong Ji, Xiuhui Yang
Journal: International Journal of Pattern Recognition and Artificial Intelligence (impact factor 1.1; JCR Q4, Computer Science, Artificial Intelligence)
Publication date: 2023-11-09
DOI: https://doi.org/10.1142/s0218001423560220
Citations: 0
Abstract
Occluded keypoints, especially those caused by mutual occlusion between human bodies, remain a challenge for human pose estimation. One possible solution is to use multi-scale features: small-scale features are capable of identifying keypoints, while large-scale features can capture the relationships between keypoints. Fusing multi-scale features allows information to be exchanged between keypoints, so that occluded keypoints can be inferred from the identified ones. However, we find that feature fusion also introduces invalid features that interfere with valid ones. In this paper, we propose a multi-scale feature refined network (MSFRNet) built on HRNet and a new attention module, the multi-resolution attention module (MRAM). MRAM is designed to strengthen effective information while suppressing redundant information. It has multiple inputs and outputs and can learn the relationships between keypoints while retaining detailed information. MSFRNet outperforms HRNet, achieving a 1.4 AP improvement on the COCO dataset with only a marginal computational increase of 0.35 GFLOPs. It also improves AP by 0.9, 0.7, and 1.8 on the MPII, CrowdPose and OCHuman datasets, respectively. Furthermore, compared with the latest attention mechanism PSA, MSFRNet has a lower computational cost while maintaining the same pose-estimation accuracy.
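The abstract does not give MRAM's internals, so the following is only a generic illustration of the idea of an attention block with multiple inputs and outputs that re-weights features across resolutions. It is a squeeze-and-excitation style channel gate, not the authors' module: the function names, the single dense layer, and the random parameters are all assumptions made for this sketch.

```python
import math
import random


def global_avg_pool(fmap):
    """Collapse a [C][H][W] feature map to one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(vec, weights, bias):
    """Dense layer; weights has shape [out][in]."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def multi_branch_channel_gate(branches, weights, bias):
    """Gate each channel of each branch using statistics from all branches.

    branches: list of [C_i][H_i][W_i] maps at different resolutions.
    weights, bias: hypothetical parameters of one dense layer that maps the
    concatenated channel descriptors (sum of C_i values) to sum of C_i gates.
    Returns the re-weighted branches (same shapes as the inputs) and the gates.
    """
    # Squeeze: one descriptor per channel, concatenated over all resolutions.
    descriptor = [v for fmap in branches for v in global_avg_pool(fmap)]
    # Excite: every gate can depend on every branch's descriptor.
    gates = [sigmoid(z) for z in linear(descriptor, weights, bias)]
    outputs, offset = [], 0
    for fmap in branches:
        branch_gates = gates[offset:offset + len(fmap)]
        offset += len(fmap)
        # Scale each channel; spatial detail inside the channel is untouched.
        outputs.append([[[g * px for px in row] for row in ch]
                        for ch, g in zip(fmap, branch_gates)])
    return outputs, gates


def random_map(c, h, w):
    return [[[random.random() for _ in range(w)] for _ in range(h)] for _ in range(c)]


random.seed(0)
high = random_map(2, 8, 8)   # high-resolution branch, few channels
low = random_map(4, 4, 4)    # low-resolution branch, more channels
n = len(high) + len(low)
weights = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
bias = [0.0] * n

(out_high, out_low), gates = multi_branch_channel_gate([high, low], weights, bias)
print(len(out_high), len(out_low), [round(g, 3) for g in gates])
```

Each gate lies strictly between 0 and 1, so a channel judged uninformative is damped rather than removed, and each output keeps its input's resolution. In a trained network the dense-layer parameters would be learned rather than sampled at random.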
About the journal:
The International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI) welcomes both theory-oriented articles and innovative application articles on new developments, and is of interest to researchers in both academia and industry.
The current scope of this journal includes:
• Pattern Recognition
• Machine Learning
• Deep Learning
• Document Analysis
• Image Processing
• Signal Processing
• Computer Vision
• Biometrics
• Biomedical Image Analysis
• Artificial Intelligence
In addition to regular papers describing original research work, survey articles on timely and important research topics are highly welcome. Special issues with focused topics within the scope of this journal are also published.