高效的高分辨率人体姿态估计网络

2022 International Workshop on Intelligent Systems (IWIS) Pub Date : 2022-08-17 DOI:10.1109/IWIS56333.2022.9920796

T. Tran, Xuan-Thuy Vo, Duy-Linh Nguyen, K. Jo

{"title":"高效的高分辨率人体姿态估计网络","authors":"T. Tran, Xuan-Thuy Vo, Duy-Linh Nguyen, K. Jo","doi":"10.1109/IWIS56333.2022.9920796","DOIUrl":null,"url":null,"abstract":"Convolution neural networks (CNNs) have achieved the best performance nowadays not just for 2D or 3D pose estimation but also for many machine vision applications (e.g., image classification, semantic segmentation, object detection and so on). Beside, The Attention Module also show their leader for improve the accuracy in neural network. Hence, the proposed research is focus on creating a suitable feed-forward AM for CNNs which can save the computational cost also improve the accuracy. First, input the tensor into the attention mechanism, which is divided into two main part: channel attention module and spatial attention module. After that, the tensor passing through a stage in the backbone network. The main mechanism then multiplies these two feature maps and sends them to the next stage of backbone. The network enhance the data in terms of long-distance dependencies (channels) and geographic data. Our proposed research would also reveal a distinction between the use of the attention mechanism and nowadays approaches. The proposed research got better result when compare with the baseline-HRNet by 1.3 points in terms of AP but maintain the number of parameter not change much. Our architecture was trained on the COCO 2017 dataset, which are now available as an open benchmark.","PeriodicalId":340399,"journal":{"name":"2022 International Workshop on Intelligent Systems (IWIS)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Efficient High-Resolution Network for Human Pose Estimation\",\"authors\":\"T. Tran, Xuan-Thuy Vo, Duy-Linh Nguyen, K. Jo\",\"doi\":\"10.1109/IWIS56333.2022.9920796\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Convolution neural networks (CNNs) have achieved the best performance nowadays not just for 2D or 3D pose estimation but also for many machine vision applications (e.g., image classification, semantic segmentation, object detection and so on). Beside, The Attention Module also show their leader for improve the accuracy in neural network. Hence, the proposed research is focus on creating a suitable feed-forward AM for CNNs which can save the computational cost also improve the accuracy. First, input the tensor into the attention mechanism, which is divided into two main part: channel attention module and spatial attention module. After that, the tensor passing through a stage in the backbone network. The main mechanism then multiplies these two feature maps and sends them to the next stage of backbone. The network enhance the data in terms of long-distance dependencies (channels) and geographic data. Our proposed research would also reveal a distinction between the use of the attention mechanism and nowadays approaches. The proposed research got better result when compare with the baseline-HRNet by 1.3 points in terms of AP but maintain the number of parameter not change much. Our architecture was trained on the COCO 2017 dataset, which are now available as an open benchmark.\",\"PeriodicalId\":340399,\"journal\":{\"name\":\"2022 International Workshop on Intelligent Systems (IWIS)\",\"volume\":\"19 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-08-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 International Workshop on Intelligent Systems (IWIS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IWIS56333.2022.9920796\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Workshop on Intelligent Systems (IWIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IWIS56333.2022.9920796","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

卷积神经网络(cnn)现在不仅在2D或3D姿态估计方面取得了最好的性能，而且在许多机器视觉应用中(例如，图像分类，语义分割，目标检测等)也取得了最好的性能。此外，注意力模块在提高神经网络的准确率方面也表现出领先地位。因此，本文的研究重点是为cnn创建一种合适的前馈调幅，既可以节省计算成本，又可以提高精度。首先，将张量输入到注意机制中，注意机制主要分为两个部分:通道注意模块和空间注意模块。之后，张量在骨干网络中经过一个阶段。然后，主机制将这两个特征映射相乘，并将它们发送到骨干的下一阶段。网络增强了数据的远距离依赖关系(通道)和地理数据。我们提出的研究还将揭示注意机制的使用与现在的方法之间的区别。与基线- hrnet相比，本研究的AP值提高了1.3个点，但保持参数数量变化不大。我们的架构是在COCO 2017数据集上进行训练的，该数据集现在可以作为开放基准使用。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Efficient High-Resolution Network for Human Pose Estimation

Convolution neural networks (CNNs) have achieved the best performance nowadays not just for 2D or 3D pose estimation but also for many machine vision applications (e.g., image classification, semantic segmentation, object detection and so on). Beside, The Attention Module also show their leader for improve the accuracy in neural network. Hence, the proposed research is focus on creating a suitable feed-forward AM for CNNs which can save the computational cost also improve the accuracy. First, input the tensor into the attention mechanism, which is divided into two main part: channel attention module and spatial attention module. After that, the tensor passing through a stage in the backbone network. The main mechanism then multiplies these two feature maps and sends them to the next stage of backbone. The network enhance the data in terms of long-distance dependencies (channels) and geographic data. Our proposed research would also reveal a distinction between the use of the attention mechanism and nowadays approaches. The proposed research got better result when compare with the baseline-HRNet by 1.3 points in terms of AP but maintain the number of parameter not change much. Our architecture was trained on the COCO 2017 dataset, which are now available as an open benchmark.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2022 International Workshop on Intelligent Systems (IWIS)

自引率

0.00%

发文量