Multi-Context Grouped Attention for Unsupervised Person Re-Identification

Kshitij Nikhal, Benjamin S. Riggan
{"title":"Multi-Context Grouped Attention for Unsupervised Person Re-Identification","authors":"Kshitij Nikhal;Benjamin S. Riggan","doi":"10.1109/TBIOM.2022.3226678","DOIUrl":null,"url":null,"abstract":"Recent advancements like multiple contextual analysis, attention mechanisms, distance-aware optimization, and multi-task guidance have been widely used for supervised person re-identification (ReID), but the implementation and effects of such methods in unsupervised ReID frameworks are non-trivial and unclear, respectively. Moreover, with increasing size and complexity of image- and video-based ReID datasets, manual or semi-automated annotation procedures for supervised ReID are becoming labor intensive and cost prohibitive, which is undesirable especially considering the likelihood of annotation errors increase with scale/complexity of data collections. Therefore, we propose a new iterative clustering framework that is insensitive to annotation errors and over-fitting ReID annotations (i.e., labels). Our proposed unsupervised framework incorporates (a) a novel multi-context group attention architecture that learns a holistic attention map from multiple local and global contexts, (b) an unsupervised clustering loss function that down-weights easily discriminative identities, and (c) a background diversity term that helps cluster persons across different cross-camera views without leveraging any identification or camera labels. We perform extensive analysis using the DukeMTMC-VideoReID and MARS video-based ReID datasets and the MSMT17 image-based ReID dataset. Our approach is shown to provide a new state-of-the-art performance for unsupervised ReID, reducing the rank-1 performance gap between supervised and unsupervised ReID to 1.1%, 12.1%, and 21.9% from 6.1%, 17.9%, and 22.6% for DukeMTMC, MARS, and MSMT17 datasets, respectively.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 2","pages":"170-182"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on biometrics, behavior, and identity science","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/9978648/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Recent advancements such as multiple contextual analysis, attention mechanisms, distance-aware optimization, and multi-task guidance have been widely used for supervised person re-identification (ReID), but the implementation and effects of such methods in unsupervised ReID frameworks are non-trivial and unclear, respectively. Moreover, with the increasing size and complexity of image- and video-based ReID datasets, manual or semi-automated annotation procedures for supervised ReID are becoming labor intensive and cost prohibitive, which is undesirable, especially since the likelihood of annotation errors increases with the scale/complexity of data collections. Therefore, we propose a new iterative clustering framework that is insensitive to annotation errors and to over-fitting ReID annotations (i.e., labels). Our proposed unsupervised framework incorporates (a) a novel multi-context group attention architecture that learns a holistic attention map from multiple local and global contexts, (b) an unsupervised clustering loss function that down-weights easily discriminated identities, and (c) a background diversity term that helps cluster persons across different cross-camera views without leveraging any identification or camera labels. We perform extensive analysis using the DukeMTMC-VideoReID and MARS video-based ReID datasets and the MSMT17 image-based ReID dataset. Our approach is shown to provide a new state-of-the-art performance for unsupervised ReID, reducing the rank-1 performance gap between supervised and unsupervised ReID from 6.1%, 17.9%, and 22.6% to 1.1%, 12.1%, and 21.9% for the DukeMTMC, MARS, and MSMT17 datasets, respectively.
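The abstract names the grouped-attention module and the down-weighted clustering loss only at a high level, so the two PyTorch sketches below are illustrative rather than the paper's actual implementation. The class and function names, the number of channel groups, the horizontal-strip local contexts, and the squeeze-and-excite-style fusion are all assumptions made for the sketch.

```python
# Hypothetical sketch of a "multi-context grouped attention" block: one global
# context plus several local (horizontal-strip) contexts each propose channel
# attention, and the proposals are fused into a single holistic attention map.
import torch
import torch.nn as nn


class MultiContextGroupedAttention(nn.Module):
    def __init__(self, channels: int, groups: int = 4, strips: int = 4):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        self.strips = strips
        # One lightweight excitation branch per context (1 global + `strips` local).
        self.excite = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels // 4, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // 4, channels, kernel_size=1),
            )
            for _ in range(1 + strips)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        # Global context: average over the whole spatial extent.
        contexts = [x.mean(dim=(2, 3), keepdim=True)]
        # Local contexts: average over horizontal strips (rough body parts).
        for strip in x.chunk(self.strips, dim=2):
            contexts.append(strip.mean(dim=(2, 3), keepdim=True))
        # Each context votes on channel importance; average the votes.
        logits = torch.stack(
            [branch(ctx) for branch, ctx in zip(self.excite, contexts)], dim=0
        ).mean(dim=0)                                        # (n, c, 1, 1)
        # Group-wise normalization: softmax over channels within each group.
        attn = logits.view(n, self.groups, c // self.groups)
        attn = torch.softmax(attn, dim=2).view(n, c, 1, 1) * (c // self.groups)
        return x * attn


if __name__ == "__main__":
    feats = torch.randn(2, 256, 16, 8)    # backbone feature map, e.g., a ResNet stage
    out = MultiContextGroupedAttention(256)(feats)
    print(out.shape)                      # torch.Size([2, 256, 16, 8])
```

A plausible design choice here is to let every context vote on channel importance and average the votes into one holistic map, which mirrors the abstract's description of learning attention from multiple local and global contexts.

The loss below is likewise only a hedged illustration of "down-weighting easily discriminated identities": a focal-style factor scales a pseudo-label cross-entropy so clusters the model already separates confidently contribute less to the gradient. The exact loss form used in the paper may differ.

```python
# Hypothetical down-weighted clustering loss over pseudo-identity logits.
import torch
import torch.nn.functional as F


def downweighted_cluster_loss(logits: torch.Tensor,
                              pseudo_labels: torch.Tensor,
                              gamma: float = 2.0) -> torch.Tensor:
    """Cross-entropy on cluster (pseudo-identity) logits, scaled by
    (1 - p_target)**gamma so easily discriminated identities are down-weighted."""
    log_probs = F.log_softmax(logits, dim=1)
    target_logp = log_probs.gather(1, pseudo_labels.unsqueeze(1)).squeeze(1)
    weights = (1.0 - target_logp.exp()).pow(gamma)   # small for confident samples
    return -(weights * target_logp).mean()
```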