Crossmodality Person Reidentification Based on Global and Local Alignment

Qiong Lou, Junfeng Li, Yaguan Qian, Anlin Sun, Fang Lu
DOI: 10.1155/2022/4330804
Journal: Wirel. Commun. Mob. Comput., vol. 106, no. 11, pp. 4330804:1-4330804:13
Published: 2022-01-06 (Journal Article)
Citations: 1

Abstract

RGB-infrared (RGB-IR) person reidentification is a challenging problem in computer vision due to the large crossmodality difference between RGB and IR images. Most traditional methods perform only feature alignment, which ignores the specific nature of the modality gap and struggles to eliminate the large differences between RGB and IR. In this paper, a novel AGF network is proposed for the RGB-IR re-ID task, based on the idea of global and local alignment. The AGF network distinguishes pedestrians across modalities globally by combining pixel alignment and feature alignment, and highlights more structural information of a person locally by weighting channels with SE-ResNet-50, achieving strong results. It consists of three modules: an alignGAN module (A), a crossmodality paired-image generation module (G), and a feature alignment module (F). First, at the pixel level, RGB images are converted into IR images through a pixel alignment strategy to directly reduce the crossmodality difference between RGB and IR images. Second, at the feature level, crossmodality paired images are generated by exchanging the modality-specific features of RGB and IR images to perform global set-level and fine-grained instance-level alignment. Finally, the SE-ResNet-50 network replaces the commonly used ResNet-50 network; by automatically learning the importance of different channel features, it strengthens the network's ability to extract fine-grained structural information of a person across modalities. Extensive experiments on the SYSU-MM01 dataset demonstrate that the proposed method favorably outperforms state-of-the-art methods. In addition, we evaluate the proposed method on a stronger baseline, and the results show that an RGB-IR re-ID method performs better on a stronger baseline.
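The channel weighting that SE-ResNet-50 contributes can be illustrated with a minimal squeeze-and-excitation (SE) gate: global average pooling squeezes each channel to a scalar, a small two-layer bottleneck produces a per-channel weight in (0, 1), and the feature map is rescaled channel-wise. The sketch below is a NumPy illustration of the general SE mechanism, not the paper's implementation; the channel count, reduction ratio, and weight matrices are illustrative assumptions.

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Apply a squeeze-and-excitation gate to a (C, H, W) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r): the two
    fully connected layers of the excitation branch, reduction ratio r.
    """
    z = x.mean(axis=(1, 2))                  # squeeze: global average pool -> (C,)
    s = np.maximum(0.0, w1 @ z)              # excitation: FC + ReLU -> (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ s)))      # FC + sigmoid -> per-channel weights in (0, 1)
    return x * s[:, None, None]              # scale: reweight each channel

# Illustrative sizes: 64 channels, reduction ratio 16, random weights.
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 7, 7))
w1 = rng.standard_normal((4, 64)) * 0.1
w2 = rng.standard_normal((64, 4)) * 0.1
y = squeeze_excite(x, w1, w2)
```

Because the sigmoid output lies strictly in (0, 1), each channel is attenuated rather than amplified, which lets the network suppress channels that carry little cross-modal structural information.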