Generalizable Person Re-Identification From a 3D Perspective: Addressing Unpredictable Viewpoint Changes

IF 8.0 · CAS Tier 1 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, THEORY & METHODS)
Bingliang Jiao;Lingqiao Liu;Liying Gao;Dapeng Oliver Wu;Guosheng Lin;Peng Wang;Yanning Zhang
{"title":"从3D角度概括的人物再识别:解决不可预测的观点变化","authors":"Bingliang Jiao;Lingqiao Liu;Liying Gao;Dapeng Oliver Wu;Guosheng Lin;Peng Wang;Yanning Zhang","doi":"10.1109/TIFS.2025.3583900","DOIUrl":null,"url":null,"abstract":"Most existing Domain Generalizable Person Re-identification (DG-ReID) methods focus on addressing style disparities between domains but often overlook the impact of unpredictable camera view changes, which we have identified as a significant factor responsible for poor generalization performance. To address this issue, we propose a novel approach from a 3D perspective, utilizing a customized 2D-to-3D reconstruction model to convert images captured from arbitrary camera views into canonical view images. However, merely applying a 3D reconstruction model in isolation may not result in improved DG-ReID performance, as reconstruction quality can be influenced by multiple factors, such as insufficient image resolution, extreme viewpoint, and environmental variations. These factors may lead to error accumulation and the loss of critical discriminative clues in the reconstructed results. To address this difficulty, we propose fusing the canonical view image with the original image using a transformer-based module. The transformer’s cross-attention mechanism is ideal for aligning and fusing the key semantic clues of the original image with the canonical view image, compensating for reconstruction errors. We demonstrate the effectiveness of our method through extensive experiments in various evaluation settings, achieving superior DG-ReID performance compared to existing approaches. Our approach addresses the impact of unpredictable camera view changes and provides a new perspective for designing DG-ReID methods.","PeriodicalId":13492,"journal":{"name":"IEEE Transactions on Information Forensics and Security","volume":"20 ","pages":"6576-6591"},"PeriodicalIF":8.0000,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Generalizable Person Re-Identification From a 3D Perspective: Addressing Unpredictable Viewpoint Changes\",\"authors\":\"Bingliang Jiao;Lingqiao Liu;Liying Gao;Dapeng Oliver Wu;Guosheng Lin;Peng Wang;Yanning Zhang\",\"doi\":\"10.1109/TIFS.2025.3583900\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most existing Domain Generalizable Person Re-identification (DG-ReID) methods focus on addressing style disparities between domains but often overlook the impact of unpredictable camera view changes, which we have identified as a significant factor responsible for poor generalization performance. To address this issue, we propose a novel approach from a 3D perspective, utilizing a customized 2D-to-3D reconstruction model to convert images captured from arbitrary camera views into canonical view images. However, merely applying a 3D reconstruction model in isolation may not result in improved DG-ReID performance, as reconstruction quality can be influenced by multiple factors, such as insufficient image resolution, extreme viewpoint, and environmental variations. These factors may lead to error accumulation and the loss of critical discriminative clues in the reconstructed results. To address this difficulty, we propose fusing the canonical view image with the original image using a transformer-based module. 
The transformer’s cross-attention mechanism is ideal for aligning and fusing the key semantic clues of the original image with the canonical view image, compensating for reconstruction errors. We demonstrate the effectiveness of our method through extensive experiments in various evaluation settings, achieving superior DG-ReID performance compared to existing approaches. Our approach addresses the impact of unpredictable camera view changes and provides a new perspective for designing DG-ReID methods.\",\"PeriodicalId\":13492,\"journal\":{\"name\":\"IEEE Transactions on Information Forensics and Security\",\"volume\":\"20 \",\"pages\":\"6576-6591\"},\"PeriodicalIF\":8.0000,\"publicationDate\":\"2025-06-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Information Forensics and Security\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11059300/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Information Forensics and Security","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11059300/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0

Abstract

Most existing Domain Generalizable Person Re-identification (DG-ReID) methods focus on addressing style disparities between domains but often overlook the impact of unpredictable camera view changes, which we have identified as a significant factor responsible for poor generalization performance. To address this issue, we propose a novel approach from a 3D perspective, utilizing a customized 2D-to-3D reconstruction model to convert images captured from arbitrary camera views into canonical view images. However, merely applying a 3D reconstruction model in isolation may not result in improved DG-ReID performance, as reconstruction quality can be influenced by multiple factors, such as insufficient image resolution, extreme viewpoint, and environmental variations. These factors may lead to error accumulation and the loss of critical discriminative clues in the reconstructed results. To address this difficulty, we propose fusing the canonical view image with the original image using a transformer-based module. The transformer’s cross-attention mechanism is ideal for aligning and fusing the key semantic clues of the original image with the canonical view image, compensating for reconstruction errors. We demonstrate the effectiveness of our method through extensive experiments in various evaluation settings, achieving superior DG-ReID performance compared to existing approaches. Our approach addresses the impact of unpredictable camera view changes and provides a new perspective for designing DG-ReID methods.
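
The abstract names cross-attention as the mechanism that aligns the original image's semantic clues with the reconstructed canonical-view image. As an illustration only, here is a minimal PyTorch sketch of what such a fusion module could look like; the class name `CrossViewFusion`, the token layout, the embedding size, and the single-block residual design are assumptions of this sketch, not details taken from the paper.

```python
# Hypothetical sketch of cross-attention fusion between original-view and
# canonical-view features, in the spirit of the abstract. All dimensions and
# the module structure are illustrative assumptions.
import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):
    """Fuse canonical-view features into original-view features via cross-attention."""

    def __init__(self, dim: int = 768, num_heads: int = 8):
        super().__init__()
        # Queries come from the original image tokens; keys/values come from
        # the reconstructed canonical-view tokens.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, dim * 4),
            nn.GELU(),
            nn.Linear(dim * 4, dim),
        )

    def forward(self, orig_tokens: torch.Tensor, canon_tokens: torch.Tensor) -> torch.Tensor:
        # orig_tokens:  (B, N, dim) patch tokens from the original image
        # canon_tokens: (B, M, dim) patch tokens from the canonical-view reconstruction
        q = self.norm_q(orig_tokens)
        kv = self.norm_kv(canon_tokens)
        fused, _ = self.attn(q, kv, kv)  # align canonical-view clues to original tokens
        x = orig_tokens + fused          # residual keeps the original discriminative cues
        return x + self.ffn(x)

# Usage: fuse tokens from both views, then pool into a ReID descriptor.
fusion = CrossViewFusion(dim=768, num_heads=8)
orig = torch.randn(2, 128, 768)    # tokens of the original image
canon = torch.randn(2, 128, 768)   # tokens of the reconstructed canonical view
embedding = fusion(orig, canon).mean(dim=1)  # (2, 768) identity descriptor
```

The residual connection around the attention block reflects the abstract's motivation: even when the canonical-view reconstruction is degraded (low resolution, extreme viewpoint, environmental variation), the original image's tokens pass through unchanged, so reconstruction errors can be compensated rather than propagated.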
Source Journal
IEEE Transactions on Information Forensics and Security
Category: Engineering Technology - Engineering: Electrical & Electronic
CiteScore: 14.40
Self-citation rate: 7.40%
Articles per year: 234
Review time: 6.5 months
Journal introduction: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.