Multi-task model for human pose estimation and person detection
Daiwei Yu, Jun Zhang, Zhao Jin, Guanqun Li, Wenjin Zhang
International Conference on Digital Image Processing, 2022-10-12
DOI: https://doi.org/10.1117/12.2644479
Abstract
Human pose estimation and person detection are two fundamental tasks in human behavior analysis. Each has seen remarkable progress on its own since the development of convolutional neural networks. Recently, researchers have paid more attention to one-stage human pose estimation and person detection to meet the needs of practical applications. However, few studies have addressed completing both tasks simultaneously in a single network. There are two main reasons: (1) it is challenging to design an effective mechanism that fully exploits the relevance and complementarity of the two tasks so that both improve, especially pose estimation accuracy; and (2) the evaluation bias caused by differences in scale sensitivity remains unsolved. To address these problems, we propose a multi-task model that performs human pose estimation and person detection simultaneously, named PersonPD (person pose and person detection). It predicts keypoint heatmaps and regresses a 4D relative displacement vector (l, t, r, b), which encodes the person bounding box and also serves as a grouping cue for keypoints. A maximum-IOU matching algorithm, named IOU-grouping, is presented to group body joints into individual persons. At the same time, it generates accurate person detection results. With this simple but effective method, our model achieves competitive person detection and pose estimation performance on the COCO dataset.
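The grouping idea can be illustrated with a minimal Python sketch. The abstract does not spell out the algorithm, so everything below is an assumption made for illustration: that (l, t, r, b) are distances from a keypoint to the left, top, right, and bottom edges of its person's box, that candidate person boxes are already available, and that each keypoint is assigned to the box with the highest IoU above a hypothetical threshold `min_iou`. The names `decode_box`, `iou`, and `group_keypoints` are not from the paper.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Keypoint = Tuple[str, float, float, Sequence[float]]  # (joint, x, y, (l, t, r, b))


def decode_box(x: float, y: float, ltrb: Sequence[float]) -> Box:
    """Assumed decoding: offsets are distances from (x, y) to the box edges."""
    l, t, r, b = ltrb
    return (x - l, y - t, x + r, y + b)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def group_keypoints(
    keypoints: List[Keypoint],
    person_boxes: List[Box],
    min_iou: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> List[Dict[str, Tuple[float, float]]]:
    """Assign each keypoint to the person box its decoded box overlaps most."""
    persons: List[Dict[str, Tuple[float, float]]] = [{} for _ in person_boxes]
    best_score: List[Dict[str, float]] = [{} for _ in person_boxes]
    for joint, x, y, ltrb in keypoints:
        candidate = decode_box(x, y, ltrb)
        scores = [iou(candidate, box) for box in person_boxes]
        if not scores:
            continue
        idx = max(range(len(scores)), key=scores.__getitem__)
        if scores[idx] < min_iou:
            continue  # no person box matches this keypoint well enough
        # Keep only the best-matching candidate per joint type and person.
        if scores[idx] > best_score[idx].get(joint, -1.0):
            best_score[idx][joint] = scores[idx]
            persons[idx][joint] = (x, y)
    return persons
```

In this sketch a keypoint whose decoded box overlaps no person box is simply dropped; how the actual model handles such cases, and how it derives the person boxes themselves, is not stated in the abstract.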