Boosting person ReID feature extraction via dynamic convolution

IF 3.7 · Zone 4, Computer Science · JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Pattern Analysis and Applications Pub Date : 2024-07-08 DOI:10.1007/s10044-024-01294-9

Elif Ecem Akbaba, Filiz Gurkan, Bilge Gunsel

Citations: 0

Abstract

Extracting discriminative features is crucial in person re-identification (ReID), which aims to match a query image of a person to images of the same person captured by different cameras. Conventional deep feature extraction methods for ReID employ CNNs with static convolutional kernels, whose parameters are optimized during training and remain fixed at inference. This limits the network's ability to model complex content and degrades performance, particularly under occlusion or pose changes. In this work, to improve performance without a significant increase in parameter size, we present a novel approach that uses a channel fusion-based dynamic convolution backbone network, which lets the kernels adapt to the input image, within two existing ReID network architectures. We replace the backbone network of two ReID methods to investigate the effect of dynamic convolution on both simple and complex networks. The first, called Baseline, is a simpler network with fewer layers, while the second, CaceNet, is a more complex architecture with higher performance. Evaluation results demonstrate that both of the designed dynamic networks improve identification accuracy compared to their static counterparts. A significant increase in accuracy is reported under occlusion, tested on Occluded-DukeMTMC. Moreover, our approach achieves performance comparable to the state of the art on Market1501, DukeMTMC-reID, and CUHK03 with a limited computational load. These findings validate the effectiveness of dynamic convolution in enhancing person ReID networks and push the boundaries of performance in this domain.
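To make the contrast between static and dynamic kernels concrete, the following is a minimal NumPy sketch of one common form of dynamic convolution: a softmax attention, computed from the globally pooled input, mixes K candidate kernels into a single input-dependent kernel. It is not the channel fusion-based variant used in the paper, whose details the abstract does not give. The class name `DynamicConv1x1`, the linear gating layer, the 1x1 kernel size, and all shapes are assumptions chosen for illustration.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over a 1-D score vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


class DynamicConv1x1:
    """Toy input-conditioned 1x1 convolution (attention over K kernels).

    Illustrative only: random weights stand in for trained parameters.
    """

    def __init__(self, c_in, c_out, num_kernels=4, seed=0):
        rng = np.random.default_rng(seed)
        # K candidate kernels; in a real network these are learned and then fixed.
        self.kernels = rng.normal(size=(num_kernels, c_out, c_in))
        # Hypothetical linear gating layer: pooled features -> K scores.
        self.gate = rng.normal(size=(num_kernels, c_in))

    def attention(self, x):
        # x has shape (c_in, H, W); global average pooling summarises it.
        pooled = x.mean(axis=(1, 2))
        return softmax(self.gate @ pooled)

    def effective_kernel(self, x):
        # Input-dependent mixture of the K kernels, shape (c_out, c_in).
        return np.tensordot(self.attention(x), self.kernels, axes=1)

    def __call__(self, x):
        w = self.effective_kernel(x)
        # A 1x1 convolution is a per-pixel matrix multiply over channels.
        return np.einsum("oc,chw->ohw", w, x)


layer = DynamicConv1x1(c_in=8, c_out=16)
rng = np.random.default_rng(1)
img_a = rng.normal(size=(8, 12, 6))  # stand-in feature maps for two images
img_b = rng.normal(size=(8, 12, 6))
out_a = layer(img_a)
kernel_a = layer.effective_kernel(img_a)
kernel_b = layer.effective_kernel(img_b)
print(out_a.shape)  # (16, 12, 6)
```

In this sketch the stored parameters (`kernels` and `gate`) are identical for every input, yet `kernel_a` and `kernel_b` differ because the attention weights depend on each image. A static convolution corresponds to the special case of a single kernel, where the attention weight is always 1.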

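Source journal

Pattern Analysis and Applications (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore

7.40

Self-citation rate

2.60%

Number of articles published

Review time

13.5 months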

About the journal: The journal publishes high quality articles in areas of fundamental research in intelligent pattern analysis and applications in computer science and engineering. It aims to provide a forum for original research which describes novel pattern analysis techniques and industrial applications of the current technology. In addition, the journal will also publish articles on pattern analysis applications in medical imaging. The journal solicits articles that detail new technology and methods for pattern recognition and analysis in applied domains including, but not limited to, computer vision and image processing, speech analysis, robotics, multimedia, document analysis, character recognition, knowledge engineering for pattern recognition, fractal analysis, and intelligent control. The journal publishes articles on the use of advanced pattern recognition and analysis methods including statistical techniques, neural networks, genetic algorithms, fuzzy pattern recognition, machine learning, and hardware implementations which are either relevant to the development of pattern analysis as a research area or detail novel pattern analysis applications. Papers proposing new classifier systems or their development, pattern analysis systems for real-time applications, fuzzy and temporal pattern recognition and uncertainty management in applied pattern recognition are particularly solicited.