{"title":"PV-YOLO: An Object Detection Model for Panoramic Video based on YOLOv4","authors":"Pengfei Jia, Tie Yun, L. Qi, Fang Zhu","doi":"10.1109/CACML55074.2022.00018","DOIUrl":null,"url":null,"abstract":"Most existing object detection methods are applied in ordinary video of limited view. This significantly limits their usefulness and efficiency in real-world large scale deployments with the need for detecting across many views. To address this efficiency issue, we develop a novel object detection model suitable for detection in panoramic videos to achieve detection within a 360-degree panorama without the need to repeat detection in each view. Specifically, we make improvements on YOLOv4 and propose PV-YOLO, using deformable convolution in the backbone network to prevent the geometric deformation problem of targets and adding transverse skip connection in the feature fusion part of the model to enhance feature fusion. Extensive comparative evaluations validate the superiority of this new PV-YOLO model for object detection in panoramic video over a wide range of state-of-art methods on several challenging benchmarks including YOLOv4, YOLOv3, Faster-RCNN, and EfficientDet-D3, etc.","PeriodicalId":137505,"journal":{"name":"2022 Asia Conference on Algorithms, Computing and Machine Learning (CACML)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 Asia Conference on Algorithms, Computing and Machine Learning (CACML)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CACML55074.2022.00018","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Most existing object detection methods are applied to ordinary video with a limited field of view, which significantly restricts their usefulness and efficiency in real-world, large-scale deployments that require detection across many views. To address this efficiency issue, we develop a novel object detection model for panoramic video that performs detection over a full 360-degree panorama without repeating detection in each individual view. Specifically, we improve YOLOv4 and propose PV-YOLO: we use deformable convolution in the backbone network to mitigate the geometric deformation of targets in panoramic frames, and we add transverse skip connections in the feature fusion part of the model to strengthen feature fusion. Extensive comparative evaluations on several challenging benchmarks validate the superiority of the new PV-YOLO model for object detection in panoramic video over a wide range of state-of-the-art methods, including YOLOv4, YOLOv3, Faster R-CNN, and EfficientDet-D3.
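To make the two architectural changes named in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of (a) a backbone block built around a deformable convolution and (b) a feature-fusion step with an extra transverse skip connection. It is not the authors' released code; the module names (`DeformableBlock`, `FusionWithSkip`), channel sizes, and layer ordering are illustrative assumptions, and only the general techniques (torchvision's `DeformConv2d` with learned offsets, concatenation-based fusion) follow standard usage.

```python
# Hypothetical sketch of the two PV-YOLO modifications described in the abstract.
# Module names and channel sizes are illustrative, not the authors' implementation.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DeformableBlock(nn.Module):
    """Backbone block: a 3x3 deformable convolution whose sampling offsets are
    predicted from the input, helping cope with the geometric deformation of
    objects in panoramic (equirectangular) frames."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # 2 * 3 * 3 offset channels: an (dx, dy) pair for each of the 9 kernel taps.
        self.offset = nn.Conv2d(in_ch, 2 * 3 * 3, kernel_size=3, padding=1)
        self.dconv = DeformConv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.LeakyReLU(0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.dconv(x, self.offset(x))))


class FusionWithSkip(nn.Module):
    """Feature-fusion step with a transverse skip connection: the backbone
    feature at the same scale is concatenated with the upsampled deeper
    feature before the fusion convolution."""

    def __init__(self, backbone_ch: int, deep_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.fuse = nn.Conv2d(backbone_ch + deep_ch, out_ch, kernel_size=1)

    def forward(self, backbone_feat: torch.Tensor, deep_feat: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([backbone_feat, self.up(deep_feat)], dim=1))


if __name__ == "__main__":
    x = torch.randn(1, 64, 128, 256)        # feature map from a panoramic frame
    y = DeformableBlock(64, 128)(x)         # -> (1, 128, 128, 256)
    deep = torch.randn(1, 256, 64, 128)     # deeper, lower-resolution feature
    fused = FusionWithSkip(128, 256, 128)(y, deep)
    print(y.shape, fused.shape)
```

In a YOLOv4-style detector the fusion step would sit inside the PANet neck; the sketch only shows the added transverse concatenation path, not the full top-down and bottom-up aggregation.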