Vehicle Detection in Distorted Driving Video Based on Metric Learning and Single Shot MultiBox Detector

2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC). Pub Date: 2019-10-01. DOI: 10.1109/BESC48373.2019.8963547

Fanghui Zhang, Yi Jin, Shichao Kan, Linna Zhang, Y. Cen, Jin Wen

Citations: 1

Abstract

With the steady development of deep learning, object detection algorithms have been applied with remarkable success, especially in autonomous driving. Most object detection algorithms are designed for images or videos captured by a conventional camera. In practice, however, fisheye cameras are widely used, and they produce distorted image frames. Research on vehicle detection with fisheye cameras has so far been relatively rare. If a network is trained on an existing public dataset and tested on distorted images or videos, its accuracy drops considerably. A distorted vehicle dataset therefore first needs to be labeled manually. However, the distorted vehicle dataset is small, and training the model on it alone would leave the public datasets underused. In addition, missed detections and false detections are a considerable problem when the SSD algorithm is applied to distorted images. Based on these considerations, this paper first adopts transfer learning to transfer the parameters learned from the public vehicle dataset to the distorted vehicle dataset. Second, it proposes an algorithm named MLSSD for distorted vehicle detection based on the labeled dataset, which combines metric learning with the SSD algorithm to greatly reduce missed and false detections. In addition, a scalable overlapping partition pooling (SOPP) method is proposed in place of spatial pyramid pooling to achieve more robust feature map pooling. Experimental results show that the proposed MLSSD algorithm significantly outperforms other algorithms and achieves 88.3% mAP on the distorted vehicle dataset, 3.1% higher than the result obtained by the SSD network.
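The abstract does not define SOPP, so the following is only a rough illustration of the idea it contrasts with: spatial pyramid pooling, which max-pools a feature map over 1x1, 2x2, 4x4, ... grids to get a fixed-length vector regardless of input size. The `overlap` parameter, the function name `partition_pool`, and the way bins are widened are assumptions made for this sketch and are not taken from the paper.

```python
import math


def partition_pool(fmap, levels=(1, 2, 4), overlap=0.0):
    """Max-pool a 2-D feature map (list of rows) into a fixed-length vector.

    overlap=0.0 is plain spatial pyramid pooling. overlap>0 widens every bin
    by that fraction of its size on each side, so neighbouring bins share
    pixels (a hypothetical overlapping variant, not the paper's SOPP).
    """
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for n in levels:  # one n x n grid per pyramid level
        bin_h, bin_w = h / n, w / n
        for i in range(n):
            # Row range of bin i, widened by `overlap` and clamped to the map.
            top = max(0, math.floor((i - overlap) * bin_h))
            bottom = min(h, math.ceil((i + 1 + overlap) * bin_h))
            for j in range(n):
                left = max(0, math.floor((j - overlap) * bin_w))
                right = min(w, math.ceil((j + 1 + overlap) * bin_w))
                pooled.append(
                    max(fmap[r][c]
                        for r in range(top, bottom)
                        for c in range(left, right))
                )
    return pooled


# A 4x4 map with values 0..15 in row-major order.
fmap = [[r * 4 + c for c in range(4)] for r in range(4)]
print(partition_pool(fmap, levels=(1, 2)))               # non-overlapping bins
print(partition_pool(fmap, levels=(1, 2), overlap=0.5))  # overlapping bins
```

The output length depends only on `levels` (here 1 + 4 + 16 = 21 values for the default), which is what lets pooling of this kind feed a fixed-size layer from feature maps of different sizes.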
