SLBNet: Shallow and Lightweight Bilateral Network for Pose Estimation
Mao-qing Zhou, W. Sun, F. Yang, Sheng Zhang
2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), 2021-10-23
DOI: 10.1109/CISP-BMEI53629.2021.9624376 (https://doi.org/10.1109/CISP-BMEI53629.2021.9624376)
Human pose estimation from images is an important task in many real-life applications. However, most existing methods focus on improving accuracy without considering efficiency, which makes the networks large and computationally expensive. Because depthwise separable convolution reduces both model size and floating point operations (FLOPs), some methods adopt it to make human pose estimation affordable on resource-constrained devices. However, depthwise separable convolution also slows down inference, especially on GPU devices. In this paper, we introduce a shallow and lightweight bilateral network (SLBNet). Our network runs inference much faster than existing methods while achieving competitive performance. We evaluate our networks on the MPII and COCO datasets. Specifically, our SLBNet yields 67.8 Average Precision (AP) on the COCO test set with only 3.6M parameters and 4.5G FLOPs, at 253 FPS on a single 2080Ti GPU and 25 FPS on an Intel i7-8700K CPU machine.
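The claim that depthwise separable convolution shrinks model size and FLOPs can be checked with simple counting. The sketch below is illustrative only and is not taken from the paper: the function names, the layer shape (256 channels, 3x3 kernel, 64x48 feature map), and the simplifications (no bias, stride 1, output the same size as the input) are all assumptions chosen for the example.

```python
def standard_conv_cost(c_in, c_out, k, h, w):
    """Weights and multiply-accumulates (MACs) of a plain k x k convolution.

    Assumes no bias, stride 1, and an h x w output feature map.
    """
    params = k * k * c_in * c_out
    macs = params * h * w  # every weight is used once per output position
    return params, macs


def depthwise_separable_cost(c_in, c_out, k, h, w):
    """Weights and MACs of a depthwise k x k conv followed by a 1 x 1 pointwise conv.

    Same assumptions as above: no bias, stride 1, h x w output.
    """
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    params = depthwise + pointwise
    macs = params * h * w
    return params, macs


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only for illustration.
    shape = dict(c_in=256, c_out=256, k=3, h=64, w=48)
    std_params, std_macs = standard_conv_cost(**shape)
    ds_params, ds_macs = depthwise_separable_cost(**shape)
    print(f"standard:  {std_params} params, {std_macs} MACs")
    print(f"separable: {ds_params} params, {ds_macs} MACs")
    print(f"reduction: {std_params / ds_params:.2f}x")
```

Under these assumptions the reduction factor is k²·c_out / (k² + c_out), which approaches k² as the channel count grows. This count covers arithmetic only; as the abstract points out, fewer FLOPs do not by themselves translate into faster inference, particularly on GPUs.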