Synthetic Feature Assessment for Zero-Shot Object Detection
Xinmiao Dai, Chong Wang, Haohe Li, Sunqi Lin, Lining Dong, Jiafei Wu, Jun Wang
2023 IEEE International Conference on Multimedia and Expo (ICME), July 2023. DOI: 10.1109/ICME55011.2023.00083
Zero-shot object detection aims to simultaneously identify and localize classes that are not seen during training. Many generative-model-based methods have shown promising performance by synthesizing the visual features of unseen classes from semantic embeddings. However, these synthetic features are inevitably of varied quality and may be far from the ground truth, which degrades the performance of the unseen-class classifier trained on them. Instead of tweaking the generative model, a new idea of feature quality assessment is proposed, which exploits both good and bad synthetic features to optimize the classifier in the right direction. Moreover, contrastive learning is introduced to enhance the feature uniqueness between unseen and seen classes, which implicitly aids the feature assessment. To demonstrate the effectiveness of the proposed algorithm, comprehensive experiments are conducted on the MS COCO and PASCAL VOC datasets, where state-of-the-art performance is achieved. Our code is available at: https://github.com/Dai1029/SFA-ZSD.
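The abstract only summarizes the pipeline; the actual implementation is in the linked repository. As a rough illustration of the general idea rather than the authors' method, the PyTorch sketch below shows how synthetic unseen-class features could be weighted by an assessed quality score when training an unseen-class classifier, together with a simple contrastive-style term that discourages synthetic unseen features from collapsing onto seen-class prototypes. All function names, shapes, and the centroid-based quality heuristic are assumptions made for illustration only.

```python
# Hypothetical sketch (not the SFA-ZSD implementation): quality-weighted
# classifier training on synthetic unseen-class features, plus a simple
# contrastive penalty against seen-class prototypes.
import torch
import torch.nn.functional as F


def train_unseen_classifier(synth_feats, synth_labels, seen_protos,
                            num_unseen, dim, epochs=10, lr=1e-3, tau=0.1):
    """synth_feats: (N, dim) features generated from semantic embeddings.
    synth_labels: (N,) unseen-class indices in [0, num_unseen).
    seen_protos:  (S, dim) mean visual features of the seen classes."""
    classifier = torch.nn.Linear(dim, num_unseen)
    opt = torch.optim.Adam(classifier.parameters(), lr=lr)

    # Class centroids of the synthetic features, used here as a crude quality
    # proxy: features close to their class centroid get a weight near 1,
    # outliers get a smaller weight.
    centroids = torch.stack([synth_feats[synth_labels == c].mean(0)
                             for c in range(num_unseen)])

    for _ in range(epochs):
        logits = classifier(synth_feats)

        # Per-sample quality score in [0, 1] from cosine similarity to the
        # class centroid; it modulates the cross-entropy so that higher-quality
        # synthetic features dominate the classifier update.
        quality = F.cosine_similarity(
            synth_feats, centroids[synth_labels], dim=1).clamp(min=0)
        ce = F.cross_entropy(logits, synth_labels, reduction="none")
        cls_loss = (quality * ce).mean()

        # Contrastive-style penalty: push synthetic unseen features away from
        # seen-class prototypes to keep unseen features distinguishable.
        sim = F.cosine_similarity(synth_feats.unsqueeze(1),
                                  seen_protos.unsqueeze(0), dim=2) / tau
        contrast_loss = torch.logsumexp(sim, dim=1).mean()

        loss = cls_loss + 0.1 * contrast_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    return classifier
```

In the paper, the quality assessment is part of the proposed method rather than a hand-crafted centroid heuristic; the sketch only illustrates how such a score can weight the classification loss and how a contrastive term can separate unseen from seen features.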