Research on a multi-person pose estimation model to balance accuracy and speed

IF 2.7 | Zone 3, Engineering & Technology | JCR Q2, Engineering, Electrical & Electronic
Xiangdong Gao, Liying Sun, Fan Zhang
DOI: 10.1016/j.image.2025.117396
Journal: Signal Processing-Image Communication, vol. 139, Article 117396
Published: 2025-08-11
Publication type: Journal Article
Open access: No
Source: https://www.sciencedirect.com/science/article/pii/S0923596525001420 (indexed via Semantic Scholar)
Citations: 0

Abstract

This article presents BASP_YOLO, an enhanced multi-person pose estimation model designed to balance accuracy and speed for real-world applications. To address the computational complexity and limited robustness of existing methods, the proposed model integrates lightweight DSConv layers, a multi-scale fusion module combining BiFPN and efficient attention mechanisms, an optimized spatial pyramid pooling module with CSPC connections, and an SPD-DS module to mitigate channel information loss. Evaluated on the MS COCO dataset, BASP_YOLO achieves a mAP@0.5 of 84.6% at 54 FPS, outperforming mainstream models such as YOLO-Pose and OpenPose. The improvements reduce computational load by 52.2% while enhancing occlusion handling, small-object detection, and robustness to environmental interference. The effectiveness of the model improvements was further validated on the MPII dataset. This work improves the accuracy of pose estimation while compromising real-time performance as little as possible, advancing deployment feasibility in resource-constrained scenarios.
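The abstract names BiFPN as part of the multi-scale fusion module but does not say how the fusion is computed. As a rough illustration only, the sketch below shows the fast normalized fusion rule commonly associated with BiFPN: each input feature gets a learnable weight, the weights are clipped to be non-negative, and the weighted sum is divided by the sum of the weights plus a small epsilon. Whether BASP_YOLO uses exactly this rule is an assumption; the function name `fast_normalized_fusion`, the `eps` value, and the toy feature vectors are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    raw_weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse equally sized feature vectors with non-negative, normalized weights.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(features) != len(raw_weights):
        raise ValueError("need exactly one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all input features must have the same length")

    # ReLU keeps every learnable weight non-negative.
    weights = [max(0.0, w) for w in raw_weights]
    # eps avoids division by zero when every weight has been clipped to 0.
    norm = eps + sum(weights)

    # Element-wise weighted sum, divided by the weight total.
    return [
        sum(w * f[i] for w, f in zip(weights, features)) / norm
        for i in range(length)
    ]


# Two flattened feature maps, assumed already resized to the same resolution.
top_down = [1.0, 2.0, 3.0]
lateral = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([top_down, lateral], [1.0, 1.0])
```

With equal weights the result is close to the element-wise mean of the inputs, and a negative raw weight simply removes that input from the fusion. Normalizing by a plain sum rather than a softmax avoids exponentials, which suits a model that aims to keep inference fast.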
Source journal
Signal Processing-Image Communication (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 8.40
Self-citation rate: 2.90%
Articles published: 138
Review time: 5.2 months
About the journal: Signal Processing: Image Communication is an international journal for the development of the theory and practice of image communication. Its primary objectives are the following: to present a forum for the advancement of theory and practice of image communication; to stimulate cross-fertilization between areas similar in nature which have traditionally been separated, for example, various aspects of visual communications and information systems; and to contribute to a rapid information exchange between the industrial and academic environments. The editorial policy and the technical content of the journal are the responsibility of the Editor-in-Chief, the Area Editors and the Advisory Editors. The journal is self-supporting from subscription income and contains a minimum amount of advertisements. Advertisements are subject to the prior approval of the Editor-in-Chief. The journal welcomes contributions from every country in the world. Signal Processing: Image Communication publishes articles relating to aspects of the design, implementation and use of image communication systems. The journal features original research work, tutorial and review articles, and accounts of practical developments. Subjects of interest include image/video coding, 3D video representations and compression, 3D graphics and animation compression, HDTV and 3DTV systems, video adaptation, video over IP, peer-to-peer video networking, interactive visual communication, multi-user video conferencing, wireless video broadcasting and communication, visual surveillance, 2D and 3D image/video quality measures, pre/post processing, video restoration and super-resolution, multi-camera video analysis, motion analysis, content-based image/video indexing and retrieval, face and gesture processing, video synthesis, 2D and 3D image/video acquisition and display technologies, and architectures for image/video processing and communication.