Hierarchical Disturbance and Group Inference for Video-Based Visible-Infrared Person Re-identification

IF 14.7 | CAS Tier 1 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence)
Chuhao Zhou, Yuzhe Zhou, Tingting Ren, Huafeng Li, Jinxing Li, Guangming Lu
{"title":"基于视频的可见-红外人物再识别的层次扰动与群体推理","authors":"Chuhao Zhou, Yuzhe Zhou, Tingting Ren, Huafeng Li, Jinxing Li, Guangming Lu","doi":"10.1016/j.inffus.2024.102882","DOIUrl":null,"url":null,"abstract":"Video-based Visible-Infrared person Re-identification (VVI-ReID) is challenging due to the large inter-view and inter-modal discrepancies. To alleviate these discrepancies, most existing works only focus on whole images, while more id-related partial information is ignored. Furthermore, the inference decision is commonly based on the similarity of two samples. However, the semantic gap between the query and gallery samples inevitably exists due to their inter-view misalignment, no matter whether the modality-gap is removed. In this paper, we proposed a Hierarchical Disturbance (HD) and Group Inference (GI) method to handle aforementioned issues. Specifically, the HD module models the inter-view and inter-modal discrepancies as multiple image styles, and conducts feature disturbances through partially transferring body styles. By hierarchically taking the partial and global features into account, our model is capable of adaptively achieving invariant but identity-related features. Additionally, instead of establishing similarity between the query sample and each gallery sample independently, the GI module is further introduced to extract complementary information from all potential intra-class gallery samples of the given query sample, which boosts the performance on matching hard samples. Extensive experiments substantiate the superiority of our method compared with state-of-the arts.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"58 1","pages":""},"PeriodicalIF":14.7000,"publicationDate":"2024-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Hierarchical disturbance and Group Inference for video-based visible-infrared person re-identification\",\"authors\":\"Chuhao Zhou, Yuzhe Zhou, Tingting Ren, Huafeng Li, Jinxing Li, Guangming Lu\",\"doi\":\"10.1016/j.inffus.2024.102882\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Video-based Visible-Infrared person Re-identification (VVI-ReID) is challenging due to the large inter-view and inter-modal discrepancies. To alleviate these discrepancies, most existing works only focus on whole images, while more id-related partial information is ignored. Furthermore, the inference decision is commonly based on the similarity of two samples. However, the semantic gap between the query and gallery samples inevitably exists due to their inter-view misalignment, no matter whether the modality-gap is removed. In this paper, we proposed a Hierarchical Disturbance (HD) and Group Inference (GI) method to handle aforementioned issues. Specifically, the HD module models the inter-view and inter-modal discrepancies as multiple image styles, and conducts feature disturbances through partially transferring body styles. By hierarchically taking the partial and global features into account, our model is capable of adaptively achieving invariant but identity-related features. Additionally, instead of establishing similarity between the query sample and each gallery sample independently, the GI module is further introduced to extract complementary information from all potential intra-class gallery samples of the given query sample, which boosts the performance on matching hard samples. 
Extensive experiments substantiate the superiority of our method compared with state-of-the arts.\",\"PeriodicalId\":50367,\"journal\":{\"name\":\"Information Fusion\",\"volume\":\"58 1\",\"pages\":\"\"},\"PeriodicalIF\":14.7000,\"publicationDate\":\"2024-12-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Fusion\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1016/j.inffus.2024.102882\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1016/j.inffus.2024.102882","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Video-based Visible-Infrared person Re-identification (VVI-ReID) is challenging due to the large inter-view and inter-modal discrepancies. To alleviate these discrepancies, most existing works focus only on whole images, while more identity-related partial information is ignored. Furthermore, the inference decision is commonly based on the similarity of two samples. However, a semantic gap between the query and gallery samples inevitably exists due to their inter-view misalignment, regardless of whether the modality gap is removed. In this paper, we propose a Hierarchical Disturbance (HD) and Group Inference (GI) method to handle the aforementioned issues. Specifically, the HD module models the inter-view and inter-modal discrepancies as multiple image styles, and conducts feature disturbances by partially transferring body styles. By hierarchically taking the partial and global features into account, our model is capable of adaptively obtaining invariant yet identity-related features. Additionally, instead of establishing similarity between the query sample and each gallery sample independently, the GI module is further introduced to extract complementary information from all potential intra-class gallery samples of the given query sample, which boosts performance on matching hard samples. Extensive experiments substantiate the superiority of our method compared with state-of-the-art methods.
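The abstract describes the two modules only at a high level. Below is a minimal, hypothetical PyTorch sketch of how such components are commonly realized: the style-based feature disturbance is assumed to follow AdaIN-style mixing of feature statistics, and the group inference is assumed to re-score a query against its top-k most similar gallery embeddings. All class names, tensor shapes, and hyperparameters (alpha, k) are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of the HD and GI ideas described in the abstract.
# Written in PyTorch; details are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HierarchicalDisturbance(nn.Module):
    """Disturbs part-level or global feature maps by partially transferring
    feature statistics (a common proxy for image 'style') between samples."""

    def __init__(self, eps: float = 1e-5):
        super().__init__()
        self.eps = eps

    def forward(self, feat: torch.Tensor, style_feat: torch.Tensor, alpha: float = 0.5):
        # feat, style_feat: (B, C, H, W) frame-level feature maps.
        mu = feat.mean(dim=(2, 3), keepdim=True)
        sigma = feat.std(dim=(2, 3), keepdim=True)
        mu_s = style_feat.mean(dim=(2, 3), keepdim=True)
        sigma_s = style_feat.std(dim=(2, 3), keepdim=True)
        # Partially transfer the style statistics; alpha controls the degree.
        mixed_mu = alpha * mu_s + (1 - alpha) * mu
        mixed_sigma = alpha * sigma_s + (1 - alpha) * sigma
        normalized = (feat - mu) / (sigma + self.eps)
        return normalized * mixed_sigma + mixed_mu


def group_inference(query: torch.Tensor, gallery: torch.Tensor, k: int = 5):
    """Re-scores a query against the gallery using its k most similar gallery
    samples as a group, rather than independent pairwise similarity alone."""
    # query: (D,) and gallery: (N, D), both L2-normalized embeddings.
    sims = gallery @ query                        # (N,) cosine similarities
    topk = sims.topk(k).indices                   # potential intra-class samples
    group_center = F.normalize(gallery[topk].mean(dim=0), dim=0)
    # Blend the direct similarity with the group-level similarity.
    return 0.5 * sims + 0.5 * (gallery @ group_center)
```

As a usage note, the disturbance module would typically be applied during training only, to make the backbone robust to view- and modality-specific styles, while the group re-scoring would be applied at retrieval time over the whole gallery.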
Source Journal
Information Fusion (Engineering & Technology — Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles published: 161
Review time: 7.9 months
Journal Description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems are welcome.