PS-ARM: An End-to-End Attention-aware Relation Mixer Network for Person Search

M. Fiaz, Hisham Cholakkal, Sanath Narayan, R. Anwer, F. Khan
DOI: 10.48550/arXiv.2210.03433
URL: https://doi.org/10.48550/arXiv.2210.03433
Published: 2022-10-07
Venue: Computer vision - ACCV ... : ... Asian Conference on Computer Vision : proceedings. Asian Conference on Computer Vision
Citation count: 3

Abstract

Person search is a challenging problem with various real-world applications. It aims at joint person detection and re-identification of a query person from uncropped gallery images. Although previous studies focus on learning rich feature information, it remains hard to retrieve the query person because of appearance deformations and background distractors. In this paper, we propose a novel attention-aware relation mixer (ARM) module for person search, which exploits the global relation between different local regions within the RoI of a person and makes it robust against various appearance deformations and occlusion. The proposed ARM is composed of a relation mixer block and a spatio-channel attention layer. The relation mixer block introduces a spatially attended spatial mixing and a channel-wise attended channel mixing to effectively capture discriminative relation features within an RoI. These discriminative relation features are further enriched by a spatio-channel attention layer, which strengthens foreground and background discriminability in a joint spatio-channel space. Our ARM module is generic and does not rely on fine-grained supervision or topological assumptions, so it can be easily integrated into any Faster R-CNN based person search method. Comprehensive experiments are performed on two challenging benchmark datasets: CUHK-SYSU and PRW. Our PS-ARM achieves state-of-the-art performance on both datasets. On the challenging PRW dataset, our PS-ARM achieves an absolute gain of 5 in the mAP score over SeqNet, while operating at a comparable speed.
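To make the structure described in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes an RoI feature flattened to S spatial positions by C channels, and an MLP-Mixer-style reading of "spatial mixing" and "channel mixing". The gate formulas (sigmoid of a mean), the residual connections, and all function names (`relation_mixer`, `spatio_channel_attention`, `arm`) are assumptions introduced here for illustration; the abstract does not specify them.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def spatial_scores(x):
    """Mean over channels for each spatial position (assumed scoring rule)."""
    return [sum(row) / len(row) for row in x]


def channel_scores(x):
    """Mean over positions for each channel (assumed scoring rule)."""
    return [sum(col) / len(col) for col in zip(*x)]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def relation_mixer(x, w_spatial, w_channel):
    """Hypothetical relation mixer block.

    x: S x C RoI feature, w_spatial: S x S, w_channel: C x C.
    """
    # Spatially attended spatial mixing: gate each position, then mix
    # information across positions, with a residual connection.
    gates = [sigmoid(s) for s in spatial_scores(x)]
    gated = [[g * v for v in row] for g, row in zip(gates, x)]
    y = add(x, matmul(w_spatial, gated))

    # Channel-wise attended channel mixing: gate each channel, then mix
    # information across channels, again with a residual connection.
    gates = [sigmoid(c) for c in channel_scores(y)]
    gated = [[g * v for g, v in zip(gates, row)] for row in y]
    return add(y, matmul(gated, w_channel))


def spatio_channel_attention(x):
    """Hypothetical joint gate: one weight per (position, channel) pair."""
    s = spatial_scores(x)
    c = channel_scores(x)
    return [[sigmoid(si + cj) * v for cj, v in zip(c, row)]
            for si, row in zip(s, x)]


def arm(x, w_spatial, w_channel):
    """Relation mixer block followed by spatio-channel attention."""
    return spatio_channel_attention(relation_mixer(x, w_spatial, w_channel))


# Toy usage: 2 positions, 3 channels, hand-picked (not learned) weights.
roi = [[1.0, -2.0, 0.5], [0.0, 3.0, -1.0]]
w_s = [[0.5, 0.5], [0.5, 0.5]]
w_c = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
refined = arm(roi, w_s, w_c)
```

In this sketch the output keeps the input shape, which is consistent with the abstract's claim that the module can be dropped into a Faster R-CNN based pipeline, where it would act on pooled RoI features. In a real model the mixing weights would be learned rather than fixed.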