Complex overlapping pedestrian target detection network based on the yolov3 model
Author: Yuchi Zhang
Journal: Transactions on Computer Science and Intelligent Systems Research
Published: 2024-04-10
DOI: 10.62051/rpbbxx55 (https://doi.org/10.62051/rpbbxx55)
Citations: 0
Abstract
This paper proposes a detection model for complex, overlapping pedestrian targets, based on the YOLOv3 model with multi-scale feature fusion and a context-aware mechanism. The data set was shot on campus with a SONY A7R3a camera and obtained after editing and collating the footage. It comprises 358 high-definition videos with a resolution of 1920×1080 and a frame rate of 50 Hz, about 179,000 frames in total. Testing shows that the detection accuracy of the proposed model is slightly better than that of the Single Shot Multibox Detector (SSD), the same as that of Faster R-CNN, and slightly worse than that of RetinaNet. However, the detection speed of YOLOv3 is more than twice that of SSD, RetinaNet and Faster R-CNN. With an input size of 320×320, YOLOv3 needs only 22 ms to process a single image, and the simplified YOLOv3-tiny is faster still.
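
The figures quoted in the abstract can be related to one another with a little arithmetic. The sketch below is only an illustration, not part of the paper's method: the function and variable names are hypothetical, and it assumes that the 22 ms figure is the full per-image cost and that frames are processed strictly one at a time, with no batching. Neither assumption is stated in the abstract.

```python
# Figures quoted in the abstract.
NUM_VIDEOS = 358
FRAME_RATE_HZ = 50
TOTAL_FRAMES = 179_000   # "about 179,000 frames", so derived values are approximate
LATENCY_MS = 22          # per-image processing time at 320x320 input


def throughput_fps(latency_ms: float) -> float:
    """Images per second when one image takes latency_ms (sequential processing assumed)."""
    return 1000.0 / latency_ms


def average_clip_seconds(total_frames: int, num_videos: int, frame_rate_hz: float) -> float:
    """Mean clip duration implied by the frame count, video count and frame rate."""
    return total_frames / num_videos / frame_rate_hz


def total_processing_minutes(total_frames: int, latency_ms: float) -> float:
    """Time to run every frame through the detector once, in minutes."""
    return total_frames * latency_ms / 1000.0 / 60.0


fps = throughput_fps(LATENCY_MS)
clip_s = average_clip_seconds(TOTAL_FRAMES, NUM_VIDEOS, FRAME_RATE_HZ)
minutes = total_processing_minutes(TOTAL_FRAMES, LATENCY_MS)

print(f"throughput: {fps:.1f} images/s")
print(f"average clip length: {clip_s:.1f} s")
print(f"one pass over the data set: {minutes:.1f} min")
```

Under these assumptions, 22 ms per image corresponds to roughly 45 images per second, which is slightly below the 50 Hz capture rate, and the data set averages about 500 frames (10 s) per video.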