Hanqing Zheng;Yuxuan Shi;Hefei Ling;Zongyi Li;Runsheng Wang;Zhongyang Li;Ping Li
IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 6, no. 2, pp. 219-229, published 2 February 2024. DOI: 10.1109/TBIOM.2024.3361677. https://ieeexplore.ieee.org/document/10418989/
Cascade Transformer Reasoning Embedded by Uncertainty for Occluded Person Re-Identification
Occluded person re-identification is a challenging task due to the various kinds of noise introduced by occlusion. Previous methods rely on external body detectors to extract additional cues, making them overly dependent on the accuracy of the detection results. In this paper, we propose a model named Cascade Transformer Reasoning Embedded by Uncertainty Network (CTU), which requires no external information. The transformer's self-attention models long-range dependencies to capture differences between pixels, which helps the model focus on the discriminative information of human bodies. However, noise such as occlusion introduces a high level of uncertainty into feature learning and causes self-attention to learn undesirable dependencies. We propose a novel structure named the Uncertainty Embedded Transformer (UT) layer, which incorporates uncertainty into the computation of self-attention weights. Introducing this uncertainty mechanism helps the network better evaluate the dependencies between pixels and focus more on human bodies. Additionally, our proposed transformer layer generates an attention mask through a Cascade Attention (CA) module to guide the next layer toward key areas of the feature map, decomposing feature learning into cascaded stages. Extensive experiments on the challenging Occluded-DukeMTMC and P-DukeMTMC datasets, among others, verify the effectiveness of our method.
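The two mechanisms the abstract describes, down-weighting attention toward high-uncertainty pixels and deriving a mask that guides the next stage, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the exact form of the uncertainty term and of the CA module is not specified in the abstract, so the additive logit penalty and the top-k mask below (`sigma`, `keep_ratio`) are hypothetical stand-ins chosen for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def uncertainty_attention(x, w_q, w_k, w_v, sigma):
    """Single-head self-attention whose logits are penalized by per-pixel uncertainty.

    x:     (n, d) flattened feature map (n pixels, d channels)
    sigma: (n,) non-negative uncertainty per pixel (hypothetical form;
           high sigma, e.g. an occluded region, receives less attention)
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)
    # Additive penalty: every query attends less to uncertain key pixels.
    logits = logits - sigma[None, :]
    attn = softmax(logits, axis=-1)
    return attn @ v, attn

def cascade_mask(attn, keep_ratio=0.5):
    """Binary mask over pixels for the next stage (illustrative stand-in
    for the CA module): keep the pixels that receive the most attention."""
    score = attn.mean(axis=0)                 # attention each pixel receives
    k = max(1, int(keep_ratio * score.size))
    thresh = np.sort(score)[-k]
    return (score >= thresh).astype(np.float32)
```

A next cascade stage would then restrict or re-weight its own attention using `cascade_mask`, so that each stage reasons over progressively more reliable body regions.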