Video-Based Person Re-Identification using Refined Attention Networks
Tanzila Rahman, Mrigank Rochan, Yang Wang
2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), published 2019-09-01
DOI: https://doi.org/10.1109/AVSS.2019.8909869
We consider the problem of video-based person re-identification, where the goal is to identify a person across videos captured by different cameras. In this paper, we propose an efficient attention-based model for re-identifying persons from videos. Our method generates an attention score for each frame based on frame-level features. The attention scores of all frames in a video are then used to produce a weighted feature vector for the input video, and this video-level feature vector is refined iteratively before it is used for re-identification. Unlike most existing deep learning methods, which rely on global or spatial representations, our approach focuses on attention scores. Extensive experiments on three benchmark datasets demonstrate that our method achieves state-of-the-art performance.
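The pooling step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring layer or the refinement rule, so both are assumptions here: frames are scored with a hypothetical linear scorer (`score_weights`), and refinement re-scores each frame against the current video-level vector for a fixed number of steps (`refine_steps`). The function names and the toy features are illustrative only and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, frames):
    """Attention-weighted combination of frame features into one video-level vector."""
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]


def video_feature(frames, score_weights, refine_steps=2):
    # Initial pass: one scalar attention score per frame, from its own features.
    # ASSUMPTION: a linear scorer stands in for the paper's attention module.
    attn = softmax([dot(score_weights, f) for f in frames])
    video = weighted_sum(attn, frames)

    # ASSUMPTION: refinement re-scores each frame against the current
    # video-level vector; the paper's actual refinement rule may differ.
    for _ in range(refine_steps):
        attn = softmax([dot(video, f) for f in frames])
        video = weighted_sum(attn, frames)
    return video, attn


# Toy input: three frames with 2-D features (hypothetical values).
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
video, attn = video_feature(frames, score_weights=[1.0, 0.0])
```

In this toy run the third frame differs from the other two, so it receives the smallest weight and contributes least to the video-level vector. In a real model the frame features would come from a CNN backbone and the scorer would be learned end to end.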