Hanqing Zheng;Yuxuan Shi;Hefei Ling;Zongyi Li;Runsheng Wang;Zhongyang Li;Ping Li
IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 6, no. 2, pp. 219-229, published 2 February 2024. DOI: 10.1109/TBIOM.2024.3361677. https://ieeexplore.ieee.org/document/10418989/
Cascade Transformer Reasoning Embedded by Uncertainty for Occluded Person Re-Identification
Occluded person re-identification is a challenging task due to the various kinds of noise introduced by occlusion. Previous methods rely on external body detectors to extract additional cues, making them overly dependent on the accuracy of the detection results. In this paper, we propose a model named Cascade Transformer Reasoning Embedded by Uncertainty Network (CTU), which requires no external information. The transformer's self-attention models long-range dependencies to capture differences between pixels, which helps the model focus on the discriminative information of human bodies. However, noise such as occlusion introduces a high level of uncertainty into feature learning and causes self-attention to learn undesirable dependencies. We propose a novel structure named the Uncertainty Embedded Transformer (UT) layer, which incorporates uncertainty into the computation of self-attention weights. Introducing this uncertainty mechanism helps the network better evaluate the dependencies between pixels and focus more on human bodies. Additionally, our proposed transformer layer generates an attention mask through a Cascade Attention (CA) module to guide the next layer toward key areas of the feature map, decomposing feature learning into cascaded stages. Extensive experiments on the challenging Occluded-DukeMTMC and P-DukeMTMC datasets, among others, verify the effectiveness of our method.
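The two mechanisms the abstract describes, down-weighting attention toward high-uncertainty pixels and deriving a mask that guides the next stage, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the exact form of the uncertainty term and of the CA module is not specified in the abstract, so the additive logit penalty and the top-k mask below (`sigma`, `keep_ratio`) are hypothetical stand-ins chosen for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def uncertainty_attention(x, w_q, w_k, w_v, sigma):
    """Single-head self-attention whose logits are penalized by per-pixel uncertainty.

    x:     (n, d) flattened feature map (n pixels, d channels)
    sigma: (n,) non-negative uncertainty per pixel (hypothetical form;
           high sigma, e.g. an occluded region, receives less attention)
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)
    # Additive penalty: every query attends less to uncertain key pixels.
    logits = logits - sigma[None, :]
    attn = softmax(logits, axis=-1)
    return attn @ v, attn

def cascade_mask(attn, keep_ratio=0.5):
    """Binary mask over pixels for the next stage (illustrative stand-in
    for the CA module): keep the pixels that receive the most attention."""
    score = attn.mean(axis=0)                 # attention each pixel receives
    k = max(1, int(keep_ratio * score.size))
    thresh = np.sort(score)[-k]
    return (score >= thresh).astype(np.float32)
```

A next cascade stage would then restrict or re-weight its own attention using `cascade_mask`, so that each stage reasons over progressively more reliable body regions.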