{"title":"SES-yolov5:小物体图形检测和可视化应用","authors":"Fengling Li, Zheng Yang, Yan Gui","doi":"10.1007/s00371-024-03591-0","DOIUrl":null,"url":null,"abstract":"<p>Small object graphics detection plays a crucial role in various domains, including surveillance, urban management, and autonomous driving. However, existing object detection methods perform poorly when it comes to detecting multiple small objects. To tackle this issue, we propose the SES-yolov5 algorithm for small object detection that incorporates a multi-scale fusion attention mechanism and feature enhancement techniques. Firstly, we enhance the neck network structure by integrating shallow feature fusion (SFF) and small object detection head (STD), enabling the extraction of more detailed shallow feature information from high-resolution images. Secondly, we integrate an efficient channel and spatial attention (ECSA) mechanism into the backbone network to further filter redundant semantic information while highlighting the small objects for detection. Finally, we introduce a spatial feature refinement module (SFRM) to connect the main network with the neck network, enhancing rich features of input neck data while expanding the receptive field of images and minimizing loss of small object information. Experimental results on the VisDrone2021 dataset demonstrate that compared to traditional YOLOv5 algorithm, SES-yolov5 achieves an 8.3% increase in mAP50 score along with improved detection accuracy by 7.5% and increased recall rate by 6.4% on average. The effectiveness of our method is also validated on the TT100K dataset. Code is available at https://github.com/Yangzheng00/SES-yolov5.git.</p>","PeriodicalId":501186,"journal":{"name":"The Visual Computer","volume":"216 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SES-yolov5: small object graphics detection and visualization applications\",\"authors\":\"Fengling Li, Zheng Yang, Yan Gui\",\"doi\":\"10.1007/s00371-024-03591-0\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Small object graphics detection plays a crucial role in various domains, including surveillance, urban management, and autonomous driving. However, existing object detection methods perform poorly when it comes to detecting multiple small objects. To tackle this issue, we propose the SES-yolov5 algorithm for small object detection that incorporates a multi-scale fusion attention mechanism and feature enhancement techniques. Firstly, we enhance the neck network structure by integrating shallow feature fusion (SFF) and small object detection head (STD), enabling the extraction of more detailed shallow feature information from high-resolution images. Secondly, we integrate an efficient channel and spatial attention (ECSA) mechanism into the backbone network to further filter redundant semantic information while highlighting the small objects for detection. Finally, we introduce a spatial feature refinement module (SFRM) to connect the main network with the neck network, enhancing rich features of input neck data while expanding the receptive field of images and minimizing loss of small object information. Experimental results on the VisDrone2021 dataset demonstrate that compared to traditional YOLOv5 algorithm, SES-yolov5 achieves an 8.3% increase in mAP50 score along with improved detection accuracy by 7.5% and increased recall rate by 6.4% on average. 
The effectiveness of our method is also validated on the TT100K dataset. Code is available at https://github.com/Yangzheng00/SES-yolov5.git.</p>\",\"PeriodicalId\":501186,\"journal\":{\"name\":\"The Visual Computer\",\"volume\":\"216 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The Visual Computer\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s00371-024-03591-0\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Visual Computer","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00371-024-03591-0","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
SES-yolov5: small object graphics detection and visualization applications
Small object graphics detection plays a crucial role in various domains, including surveillance, urban management, and autonomous driving. However, existing object detection methods perform poorly when detecting multiple small objects. To tackle this issue, we propose SES-yolov5, a small object detection algorithm that incorporates a multi-scale fusion attention mechanism and feature enhancement techniques. First, we enhance the neck network structure by integrating shallow feature fusion (SFF) and a small object detection head (STD), enabling the extraction of more detailed shallow feature information from high-resolution images. Second, we integrate an efficient channel and spatial attention (ECSA) mechanism into the backbone network to further filter redundant semantic information while highlighting the small objects to be detected. Finally, we introduce a spatial feature refinement module (SFRM) that connects the main network with the neck network, enriching the features fed into the neck while expanding the receptive field of the images and minimizing the loss of small object information. Experimental results on the VisDrone2021 dataset demonstrate that, compared to the traditional YOLOv5 algorithm, SES-yolov5 achieves an 8.3% increase in mAP50, a 7.5% improvement in detection accuracy, and a 6.4% increase in recall on average. The effectiveness of our method is also validated on the TT100K dataset. Code is available at https://github.com/Yangzheng00/SES-yolov5.git.
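The abstract does not spell out how the ECSA block is built. As a rough illustration of what a combined channel-and-spatial attention module of this kind could look like, the PyTorch sketch below pairs ECA-style channel gating with CBAM-style spatial gating. The class name, kernel sizes, and layer choices are assumptions made for illustration and are not taken from the SES-yolov5 repository.

```python
# Hypothetical sketch: not the authors' ECSA implementation.
import torch
import torch.nn as nn


class ChannelSpatialAttention(nn.Module):
    """Illustrative channel + spatial attention block (ECA-style channel
    gating followed by CBAM-style spatial gating). Names and layer choices
    are assumptions, not taken from the SES-yolov5 code."""

    def __init__(self, channels: int, k_size: int = 3):
        super().__init__()
        # Channel attention: global average pool + 1-D conv across channels.
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.channel_conv = nn.Conv1d(1, 1, kernel_size=k_size,
                                      padding=k_size // 2, bias=False)
        # Spatial attention: 7x7 conv over channel-pooled maps.
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Channel attention: reweight channels with a lightweight 1-D conv.
        y = self.avg_pool(x).view(b, 1, c)           # (B, 1, C)
        y = self.channel_conv(y).view(b, c, 1, 1)    # (B, C, 1, 1)
        x = x * self.sigmoid(y)
        # Spatial attention: reweight locations using mean/max channel maps.
        avg_map = torch.mean(x, dim=1, keepdim=True)       # (B, 1, H, W)
        max_map, _ = torch.max(x, dim=1, keepdim=True)     # (B, 1, H, W)
        s = self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
        return x * self.sigmoid(s)


if __name__ == "__main__":
    feat = torch.randn(2, 128, 40, 40)   # a mid-level backbone feature map
    attn = ChannelSpatialAttention(channels=128)
    print(attn(feat).shape)              # torch.Size([2, 128, 40, 40])
```

A block like this is typically dropped into backbone stages after the convolutional layers, so the attention weights can suppress background clutter before the features reach the neck; the actual placement and design used by the authors is documented in their repository linked above.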