An end-to-end occluded person re-identification network with smoothing corrupted feature prediction

IF 13.9; Zone 2 (Computer Science); JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Artificial Intelligence Review Pub Date : 2024-12-20 DOI:10.1007/s10462-024-11047-z

Caijie Zhao, Ying Qin, Bob Zhang, Yajie Zhao, Baoyun Wu

Volume 58, issue 2

Article page: https://link.springer.com/article/10.1007/s10462-024-11047-z

Full text (PDF): https://link.springer.com/content/pdf/10.1007/s10462-024-11047-z.pdf

Citations: 0

Abstract

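Occluded person re-identification (ReID) is challenging because images contain various obstacles and carry less discriminative information when body parts are incomplete. Most current works rely on auxiliary models to infer the visible body parts and on part-level feature matching to overcome contaminated body information, which adds inference time and fails under complex occlusions. More recently, some methods have used the masks provided by image occlusion augmentation (OA) to supervise mask learning. These works estimate an occlusion score for each part of the image by roughly dividing it in the horizontal direction, so they cannot predict the occlusion accurately and fail on vertical occlusions. To address this issue, we propose a Smoothing Corrupted Feature Prediction (SCFP) network that handles occluded person ReID in an end-to-end manner. Specifically, aided by OA, which simulates the occlusions that appear on pedestrians and provides occlusion masks, the proposed Occlusion Decoder and Estimator (ODE) estimates and eliminates corrupted features; it is supervised by mask labels generated by restricting all occlusions to a group of patterns. We also design Occlusion Pattern Smoothing (OPS) to improve the performance of ODE when predicting irregular obstacles. Subsequently, a Local-to-Body (L2B) representation is constructed to mitigate the limitation of partial body information in final matching. To investigate the performance of SCFP, we compare our model with existing state-of-the-art methods on occluded and holistic person ReID benchmarks and find that it achieves superior results. It attains the highest Rank-1 accuracies of 70.9%, 87.0%, and 93.2% on Occluded-Duke, Occluded-ReID, and P-DukeMTMC, respectively. Furthermore, SCFP generalizes well to holistic datasets, yielding accuracies of 95.8% on Market-1501 and 90.7% on DukeMTMC-reID.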
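The limitation of horizontal-only occlusion scoring described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the rectangular occluder, and the 4 x 4 grid are assumptions made for illustration, and the abstract does not specify the actual pattern set used to supervise ODE.

```python
import numpy as np


def occlusion_augment(image, top, left, height, width, fill=0.0):
    """Paste a rectangular occluder (hypothetical OA step).

    Returns the occluded image and a binary mask (1 = occluded pixel).
    """
    occluded = image.copy()
    mask = np.zeros(image.shape[:2], dtype=np.float32)
    occluded[top:top + height, left:left + width] = fill
    mask[top:top + height, left:left + width] = 1.0
    return occluded, mask


def stripe_occlusion_scores(mask, num_stripes):
    """Fraction of occluded pixels in each horizontal stripe."""
    stripes = np.array_split(mask, num_stripes, axis=0)
    return np.array([stripe.mean() for stripe in stripes])


def grid_occlusion_scores(mask, rows, cols):
    """Fraction of occluded pixels in each cell of a rows x cols grid.

    The grid is an illustrative assumption, not the paper's pattern set.
    """
    row_blocks = np.array_split(mask, rows, axis=0)
    return np.array([
        [cell.mean() for cell in np.array_split(block, cols, axis=1)]
        for block in row_blocks
    ])


# A 256 x 128 pedestrian crop with a vertical occluder over the left quarter.
image = np.ones((256, 128, 3), dtype=np.float32)
occluded, mask = occlusion_augment(image, top=0, left=0, height=256, width=32)

stripe_scores = stripe_occlusion_scores(mask, num_stripes=4)
grid_scores = grid_occlusion_scores(mask, rows=4, cols=4)

print(stripe_scores)  # every stripe reports the same partial occlusion
print(grid_scores)    # only the left column of cells is marked occluded
```

In this toy case every horizontal stripe receives the same score of 0.25, so the stripe labels cannot say where the occluder is, whereas a labeling that also varies across the width localizes it to the left column.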




Source journal

Artificial Intelligence Review (Engineering and Technology - Computer Science: Artificial Intelligence)

CiteScore: 22.00

Self-citation rate: 3.30%

Articles published: 194

Review time: 5.3 months

About the journal: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.