{"title":"基于跨模态属性的人物检索交互式框架","authors":"Andreas Specker, Arne Schumann, J. Beyerer","doi":"10.1109/AVSS.2019.8909832","DOIUrl":null,"url":null,"abstract":"Person re-identification systems generally rely on a query person image to find additional occurrences of this person across a camera network. In many real-world situations, however, no such query image is available and witness testimony is the only clue upon which to base a search. Cross-modal re-identification based on attribute queries can help in such cases but currently yields a low matching accuracy which is often not sufficient for practical applications. In this work we propose an interactive feedback-driven framework, which successfully bridges the modality gap and achieves a significant increase in accuracy by 47% in mean average precision (mAP) compared to the fully automatic cross-modal state-of-the-art. We further propose a cluster-based feedback method as part of the framework, which outperforms naïve user feedback by more than 9% mAP. Our results set a new state-of-the-art for fully automatic and feedback-driven cross-modal attribute-based re-identification on two public datasets.","PeriodicalId":243194,"journal":{"name":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"An Interactive Framework for Cross-modal Attribute-based Person Retrieval\",\"authors\":\"Andreas Specker, Arne Schumann, J. Beyerer\",\"doi\":\"10.1109/AVSS.2019.8909832\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Person re-identification systems generally rely on a query person image to find additional occurrences of this person across a camera network. In many real-world situations, however, no such query image is available and witness testimony is the only clue upon which to base a search. Cross-modal re-identification based on attribute queries can help in such cases but currently yields a low matching accuracy which is often not sufficient for practical applications. In this work we propose an interactive feedback-driven framework, which successfully bridges the modality gap and achieves a significant increase in accuracy by 47% in mean average precision (mAP) compared to the fully automatic cross-modal state-of-the-art. We further propose a cluster-based feedback method as part of the framework, which outperforms naïve user feedback by more than 9% mAP. 
Our results set a new state-of-the-art for fully automatic and feedback-driven cross-modal attribute-based re-identification on two public datasets.\",\"PeriodicalId\":243194,\"journal\":{\"name\":\"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)\",\"volume\":\"45 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AVSS.2019.8909832\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AVSS.2019.8909832","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
An Interactive Framework for Cross-modal Attribute-based Person Retrieval
Person re-identification systems generally rely on a query image of a person to find additional occurrences of that person across a camera network. In many real-world situations, however, no such query image is available, and witness testimony is the only clue on which to base a search. Cross-modal re-identification based on attribute queries can help in such cases, but it currently yields low matching accuracy that is often insufficient for practical applications. In this work, we propose an interactive, feedback-driven framework that successfully bridges the modality gap and achieves a significant accuracy gain of 47% in mean average precision (mAP) over the fully automatic cross-modal state of the art. As part of the framework, we further propose a cluster-based feedback method that outperforms naïve user feedback by more than 9% mAP. Our results set a new state of the art for both fully automatic and feedback-driven cross-modal attribute-based re-identification on two public datasets.
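The abstract does not spell out how the cluster-based feedback mechanism works. As a rough illustration only, the Python sketch below shows one plausible shape of such a loop: cluster the top-ranked gallery results, collect a single relevant/irrelevant judgment per cluster, and refine the query toward relevant clusters. The Rocchio-style update, the weights, and all names (rank_gallery, feedback_round, user_labels_cluster) are assumptions for illustration, not the authors' implementation.

# Minimal sketch of a cluster-based feedback loop for attribute-based
# retrieval. The protocol and all names are illustrative assumptions;
# they are not taken from the paper.
import numpy as np
from sklearn.cluster import KMeans

def rank_gallery(query_vec, gallery_feats):
    """Rank gallery images by cosine similarity to the attribute query."""
    q = query_vec / np.linalg.norm(query_vec)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    scores = g @ q
    return np.argsort(-scores), scores

def feedback_round(query_vec, gallery_feats, top_k=100, n_clusters=5,
                   user_labels_cluster=None):
    """One feedback round: cluster the top-k results, let the user label
    whole clusters instead of single images, and refine the query toward
    relevant clusters (a Rocchio-style update, assumed here)."""
    order, _ = rank_gallery(query_vec, gallery_feats)
    top = order[:top_k]
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(gallery_feats[top])
    new_query = query_vec.copy()
    for c in range(n_clusters):
        members = top[km.labels_ == c]
        centroid = gallery_feats[members].mean(axis=0)
        # user_labels_cluster stands in for showing the user a few
        # representative images from cluster c and collecting a yes/no.
        if user_labels_cluster(members):   # cluster judged relevant
            new_query += 0.5 * centroid
        else:                              # cluster judged irrelevant
            new_query -= 0.1 * centroid
    return new_query

# Usage with random features and a dummy feedback oracle:
rng = np.random.default_rng(0)
gallery = rng.normal(size=(1000, 128)).astype(np.float32)
query = rng.normal(size=128).astype(np.float32)
refined = feedback_round(query, gallery,
                         user_labels_cluster=lambda m: bool(rng.integers(2)))
ranking, _ = rank_gallery(refined, gallery)

The intuition behind labeling clusters rather than individual images is that one user judgment propagates to many visually similar results at once, which is consistent with the reported gain of more than 9% mAP over naïve per-image feedback.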