车辆再识别的局部增强多分辨率表示学习

ACM Multimedia Asia Pub Date : 2021-12-01 DOI:10.1145/3469877.3497690

Jun Zhang, X. Zhong, Jingling Yuan, Shilei Zhao, Rongbo Zhang, Duxiu Feng, Luo Zhong

{"title":"车辆再识别的局部增强多分辨率表示学习","authors":"Jun Zhang, X. Zhong, Jingling Yuan, Shilei Zhao, Rongbo Zhang, Duxiu Feng, Luo Zhong","doi":"10.1145/3469877.3497690","DOIUrl":null,"url":null,"abstract":"In real traffic scenarios, the changes of vehicle resolution that the camera captures tend to be relatively obvious considering the distances to the vehicle, different directions, and height of the camera. When the resolution difference exists between the probe and the gallery vehicle, the resolution mismatch will occur, which will seriously influence the performance of the vehicle re-identification (Re-ID). This problem is also known as multi-resolution vehicle Re-ID. An effective strategy is equivalent to utilize image super-resolution to handle the resolution gap. However, existing methods conduct super-resolution on global images instead of local representation of each image, leading to much more noisy information generated from the background and illumination variations. In our work, a local-enhanced multi-resolution representation learning (LMRL) is therefore proposed to address these problems by combining the training of local-enhanced super-resolution (LSR) module and local-guided contrastive learning (LCL) module. Specifically, we use a parsing network to parse a vehicle into four different parts to extract local-enhanced vehicle representation. And then, the LSR module, which consists of two auto-encoders that share parameters, transforms low-resolution images into high-resolution in both global and local branches. LCL module can learn discriminative vehicle representation by contrasting local representation between the high-resolution reconstructed image and the ground truth. We evaluate our approach on two public datasets that contain vehicle images at a wide range of resolutions, in which our approach shows significant superiority to the existing solution.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"160 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Local-enhanced Multi-resolution Representation Learning for Vehicle Re-identification\",\"authors\":\"Jun Zhang, X. Zhong, Jingling Yuan, Shilei Zhao, Rongbo Zhang, Duxiu Feng, Luo Zhong\",\"doi\":\"10.1145/3469877.3497690\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In real traffic scenarios, the changes of vehicle resolution that the camera captures tend to be relatively obvious considering the distances to the vehicle, different directions, and height of the camera. When the resolution difference exists between the probe and the gallery vehicle, the resolution mismatch will occur, which will seriously influence the performance of the vehicle re-identification (Re-ID). This problem is also known as multi-resolution vehicle Re-ID. An effective strategy is equivalent to utilize image super-resolution to handle the resolution gap. However, existing methods conduct super-resolution on global images instead of local representation of each image, leading to much more noisy information generated from the background and illumination variations. In our work, a local-enhanced multi-resolution representation learning (LMRL) is therefore proposed to address these problems by combining the training of local-enhanced super-resolution (LSR) module and local-guided contrastive learning (LCL) module. Specifically, we use a parsing network to parse a vehicle into four different parts to extract local-enhanced vehicle representation. And then, the LSR module, which consists of two auto-encoders that share parameters, transforms low-resolution images into high-resolution in both global and local branches. LCL module can learn discriminative vehicle representation by contrasting local representation between the high-resolution reconstructed image and the ground truth. We evaluate our approach on two public datasets that contain vehicle images at a wide range of resolutions, in which our approach shows significant superiority to the existing solution.\",\"PeriodicalId\":210974,\"journal\":{\"name\":\"ACM Multimedia Asia\",\"volume\":\"160 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Multimedia Asia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3469877.3497690\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Multimedia Asia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3469877.3497690","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

在真实的交通场景中，考虑到与车辆的距离、不同的方向和摄像机的高度，摄像机捕捉到的车辆分辨率的变化往往比较明显。当探针与通道车辆存在分辨率差异时，会产生分辨率失配，严重影响车辆再识别的性能。这个问题也被称为多分辨率车辆Re-ID。一种有效的策略相当于利用图像超分辨率来处理分辨率差距。然而，现有方法对全局图像进行超分辨率处理，而不是对每张图像进行局部表示，导致背景和光照变化产生更多的噪声信息。在我们的工作中，我们提出了一种局部增强的多分辨率表示学习(LMRL)，通过结合局部增强的超分辨率(LSR)模块和局部引导的对比学习(LCL)模块的训练来解决这些问题。具体来说，我们使用解析网络将车辆解析为四个不同的部分，以提取局部增强的车辆表示。然后，由两个共享参数的自编码器组成的LSR模块将低分辨率图像转换为全球和本地分支的高分辨率图像。LCL模块通过对比高分辨率重建图像与地面真实图像的局部表示，学习判别性车辆表示。我们在两个公共数据集上评估了我们的方法，这些数据集包含各种分辨率的车辆图像，在这些数据集上，我们的方法比现有的解决方案显示出显著的优势。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Local-enhanced Multi-resolution Representation Learning for Vehicle Re-identification

In real traffic scenarios, the changes of vehicle resolution that the camera captures tend to be relatively obvious considering the distances to the vehicle, different directions, and height of the camera. When the resolution difference exists between the probe and the gallery vehicle, the resolution mismatch will occur, which will seriously influence the performance of the vehicle re-identification (Re-ID). This problem is also known as multi-resolution vehicle Re-ID. An effective strategy is equivalent to utilize image super-resolution to handle the resolution gap. However, existing methods conduct super-resolution on global images instead of local representation of each image, leading to much more noisy information generated from the background and illumination variations. In our work, a local-enhanced multi-resolution representation learning (LMRL) is therefore proposed to address these problems by combining the training of local-enhanced super-resolution (LSR) module and local-guided contrastive learning (LCL) module. Specifically, we use a parsing network to parse a vehicle into four different parts to extract local-enhanced vehicle representation. And then, the LSR module, which consists of two auto-encoders that share parameters, transforms low-resolution images into high-resolution in both global and local branches. LCL module can learn discriminative vehicle representation by contrasting local representation between the high-resolution reconstructed image and the ground truth. We evaluate our approach on two public datasets that contain vehicle images at a wide range of resolutions, in which our approach shows significant superiority to the existing solution.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Multimedia Asia

自引率

0.00%

发文量