{"title":"实现视图无关车型识别中的边缘计算能力","authors":"","doi":"10.1016/j.ijtst.2023.03.007","DOIUrl":null,"url":null,"abstract":"<div><p>Vehicle model recognition (VMR) benefits the parking, surveillance, and tolling system by automatically identifying the exact make and model of the passing vehicles. Edge computing technology enables the roadside facilities and mobile cameras to achcieve VMR in real-time. Current work generally relies on a specific view of the vehicle or requires huge calculation capability to deploy the end-to-end deep learning network. This paper proposes a lightweight two-stage identification method based on object detection and image retrieval techniques, which empowers us the ability of recognizing the vehicle model from an arbitrary view. The first-stage model estimates the vehicle posture using object detection and similarity matching, which is cost-efficient and suitable to be programmed in the edge computing devices; the second-stage model retrieves the vehicle’s label from the dataset based on gradient boosting decision tree (GBDT) algorithm and VGGNet, which is flexible to the changing dataset. More than 8 000 vehicle images are labeled with their components’ information, such as headlights, windows, wheels, and logos. The YOLO network is employed to detect and localize the typical components of a vehicle. The vehicle postures are estimated by the spatial relationship between different segmented components. Due to the variety of the perspectives, a 7-dimensional vector is defined to represent the relative posture of the vehicle and screen out the images with a similar photographic perspective. Two algorithms are used to extract the features from each image patch: (1) the scale invariant feature transform (SIFT) combined with the bag-of-features (BoF) and (2) pre-trained deep neural network. The GBDT is applied to evaluate the weight of each component regarding its impact on VMR. 
The descriptors of each component are then aggregated to retrieve the best matching image from the database. The results showed its advantages in terms of accuracy (89.2%) and efficiency, demonstrating the vast potential of applying this method to large-scale vehicle model recognition.</p></div>","PeriodicalId":52282,"journal":{"name":"International Journal of Transportation Science and Technology","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S204604302300028X/pdfft?md5=ce6f5579d9069f7f5e9ff520676a8fd5&pid=1-s2.0-S204604302300028X-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Enabling edge computing ability in view-independent vehicle model recognition\",\"authors\":\"\",\"doi\":\"10.1016/j.ijtst.2023.03.007\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Vehicle model recognition (VMR) benefits the parking, surveillance, and tolling system by automatically identifying the exact make and model of the passing vehicles. Edge computing technology enables the roadside facilities and mobile cameras to achcieve VMR in real-time. Current work generally relies on a specific view of the vehicle or requires huge calculation capability to deploy the end-to-end deep learning network. This paper proposes a lightweight two-stage identification method based on object detection and image retrieval techniques, which empowers us the ability of recognizing the vehicle model from an arbitrary view. The first-stage model estimates the vehicle posture using object detection and similarity matching, which is cost-efficient and suitable to be programmed in the edge computing devices; the second-stage model retrieves the vehicle’s label from the dataset based on gradient boosting decision tree (GBDT) algorithm and VGGNet, which is flexible to the changing dataset. 
More than 8 000 vehicle images are labeled with their components’ information, such as headlights, windows, wheels, and logos. The YOLO network is employed to detect and localize the typical components of a vehicle. The vehicle postures are estimated by the spatial relationship between different segmented components. Due to the variety of the perspectives, a 7-dimensional vector is defined to represent the relative posture of the vehicle and screen out the images with a similar photographic perspective. Two algorithms are used to extract the features from each image patch: (1) the scale invariant feature transform (SIFT) combined with the bag-of-features (BoF) and (2) pre-trained deep neural network. The GBDT is applied to evaluate the weight of each component regarding its impact on VMR. The descriptors of each component are then aggregated to retrieve the best matching image from the database. The results showed its advantages in terms of accuracy (89.2%) and efficiency, demonstrating the vast potential of applying this method to large-scale vehicle model recognition.</p></div>\",\"PeriodicalId\":52282,\"journal\":{\"name\":\"International Journal of Transportation Science and Technology\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2024-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S204604302300028X/pdfft?md5=ce6f5579d9069f7f5e9ff520676a8fd5&pid=1-s2.0-S204604302300028X-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Transportation Science and 
Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S204604302300028X\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"TRANSPORTATION\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Transportation Science and Technology","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S204604302300028X","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"TRANSPORTATION","Score":null,"Total":0}
Enabling edge computing ability in view-independent vehicle model recognition
Vehicle model recognition (VMR) benefits parking, surveillance, and tolling systems by automatically identifying the exact make and model of passing vehicles. Edge computing technology enables roadside facilities and mobile cameras to achieve VMR in real time. Current work generally relies on a specific view of the vehicle or requires substantial computational capacity to deploy end-to-end deep learning networks. This paper proposes a lightweight two-stage identification method based on object detection and image retrieval techniques, which enables recognition of the vehicle model from an arbitrary view. The first-stage model estimates the vehicle posture using object detection and similarity matching, which is cost-efficient and suitable for deployment on edge computing devices; the second-stage model retrieves the vehicle's label from the dataset using the gradient boosting decision tree (GBDT) algorithm and VGGNet, which adapts flexibly to a changing dataset. More than 8,000 vehicle images are labeled with component information, such as headlights, windows, wheels, and logos. The YOLO network is employed to detect and localize the typical components of a vehicle. Vehicle postures are estimated from the spatial relationships between the segmented components. To handle the variety of perspectives, a 7-dimensional vector is defined to represent the relative posture of the vehicle and to screen out images with a similar photographic perspective. Two algorithms are used to extract features from each image patch: (1) the scale-invariant feature transform (SIFT) combined with the bag-of-features (BoF) model and (2) a pre-trained deep neural network. GBDT is applied to weight each component according to its impact on VMR. The descriptors of each component are then aggregated to retrieve the best matching image from the database.
The results show advantages in accuracy (89.2%) and efficiency, demonstrating the method's potential for large-scale vehicle model recognition.
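The two-stage retrieval described in the abstract can be illustrated with a minimal sketch: first screen the database by posture-vector similarity to keep only images shot from a comparable perspective, then rank the survivors by a weighted aggregate of per-component descriptor similarities. The 7-dimensional posture encoding, the dictionary layout (`posture`, `desc`, `label`), the cosine-similarity metric, and the component weights are all illustrative assumptions; the paper learns the weights with GBDT and extracts descriptors via SIFT+BoF or a pre-trained network, neither of which is reproduced here.

```python
import math

def cosine_similarity(a, b):
    # Plain cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def screen_by_posture(query_posture, database, threshold=0.9):
    """First stage: keep only database entries whose 7-d posture vector
    (an assumed encoding of component spatial relations) resembles the
    query's, i.e. a similar photographic perspective."""
    return [entry for entry in database
            if cosine_similarity(query_posture, entry["posture"]) >= threshold]

def weighted_match_score(query_desc, candidate_desc, weights):
    """Second stage: aggregate per-component descriptor similarities,
    weighted by each component's importance for VMR (learned via GBDT
    in the paper; fixed weights are used here for illustration)."""
    score = 0.0
    for component, w in weights.items():
        if component in query_desc and component in candidate_desc:
            score += w * cosine_similarity(query_desc[component],
                                           candidate_desc[component])
    return score

def retrieve_best(query, database, weights):
    # Screen by perspective, then rank candidates by weighted descriptor match.
    candidates = screen_by_posture(query["posture"], database)
    if not candidates:
        candidates = database  # fall back to the full database if nothing passes
    return max(candidates,
               key=lambda e: weighted_match_score(query["desc"], e["desc"], weights))
```

The screening step is what keeps the method lightweight on edge devices: the expensive descriptor comparison only runs on the small perspective-matched subset.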