Generalized Category Discovery in Aerial Image Classification via Slot Attention

Drones, Pub Date: 2024-04-19, DOI: 10.3390/drones8040160

Yifan Zhou, Haoran Zhu, Yan Zhang, Shuo Liang, Yujing Wang, Wen Yang


Citations: 0

Abstract



Aerial images record the dynamic Earth terrain, reflecting changes in land cover patterns caused by natural processes and human activities. However, prevailing aerial image classification methods operate mainly within a closed-set framework and therefore struggle to identify newly emerging scenes. To address this, this paper explores an aerial image recognition scenario in which a dataset comprises both labeled and unlabeled aerial images and the goal is to classify all images in the unlabeled subset, a task termed Generalized Category Discovery (GCD). Notably, the unlabeled images may belong to labeled classes or to novel classes. Specifically, we first develop a contrastive learning framework drawing upon the cutting-edge algorithms in GCD. Based on the multi-object characteristics of aerial images, we then propose a slot attention-based GCD training process (Slot-GCD) that performs contrastive learning at both the object and image levels. It decouples multiple local object features from feature maps using slots and then reconstructs the overall semantic feature of the image based on slot confidence scores and the feature map. Finally, these object-level and image-level features are fed into the contrastive learning module so that the model learns more precise image semantic features. Comprehensive evaluations on three public aerial image datasets show that our approach outperforms state-of-the-art methods. In particular, Slot-GCD achieves a recognition accuracy of 91.5% on known (old) classes and 81.9% on unknown (novel) classes on the AID dataset.
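The slot step described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' Slot-GCD implementation: it is a simplified form of slot attention in which the learned projections and the GRU/MLP slot update are omitted, and the "slot confidence" used for pooling is an assumed stand-in (each slot's mean attention mass). The function names `slot_attention` and `image_feature`, the slot count, and the iteration count are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def slot_attention(features, num_slots=3, iters=3, seed=0):
    """Simplified slot attention over a flattened feature map.

    features: list of N location vectors, each of length D.
    Returns (slots, attn), where attn[k][n] is slot k's share of location n.
    """
    rng = random.Random(seed)
    n, d = len(features), len(features[0])
    slots = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(num_slots)]
    attn = []
    for _ in range(iters):
        # Softmax over slots at each location, so slots compete for locations.
        cols = []
        for f in features:
            logits = [sum(a * b for a, b in zip(s, f)) / math.sqrt(d) for s in slots]
            cols.append(softmax(logits))
        attn = [[cols[j][k] for j in range(n)] for k in range(num_slots)]
        # Move each slot to the attention-weighted mean of the locations it claimed.
        slots = []
        for row in attn:
            mass = sum(row) + 1e-8
            slots.append([sum(w * f[i] for w, f in zip(row, features)) / mass
                          for i in range(d)])
    return slots, attn


def image_feature(features, attn):
    """Pool locations into one image-level vector, weighted by slot confidence.

    Assumption: confidence is each slot's mean attention mass, not the paper's score.
    """
    n, d = len(features), len(features[0])
    conf = [sum(row) / n for row in attn]
    loc_w = [sum(c * row[j] for c, row in zip(conf, attn)) for j in range(n)]
    total = sum(loc_w)
    return [sum(w * f[i] for w, f in zip(loc_w, features)) / total for i in range(d)]


# Toy feature map: 16 locations with 8 channels each (random stand-in data).
rng = random.Random(1)
feature_map = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(16)]
slots, attn = slot_attention(feature_map)
pooled = image_feature(feature_map, attn)
```

In this sketch, `slots` plays the role of the object-level features and `pooled` the image-level feature. In a training setup like the one the abstract describes, both would be passed to a contrastive loss, which is left out here.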


Source journal

Drones

Self-citation rate

0.00%

Number of articles published