João Pedro O. Batisteli;Silvio Jamil F. Guimarães;Zenilton K. G. Patrocínio
{"title":"遥感场景分类中的层次多尺度表示","authors":"João Pedro O. Batisteli;Silvio Jamil F. Guimarães;Zenilton K. G. Patrocínio","doi":"10.1109/LGRS.2025.3587580","DOIUrl":null,"url":null,"abstract":"Remote sensing scene classification (RSSC) poses significant challenges due to high spatial variability, complex textures, and semantic ambiguity in remote sensing imagery. While convolutional neural networks (CNNs) and transformer-based models have achieved notable success in this domain, their performance often depends on large-scale pretraining and substantial computational resources. Graph neural networks (GNNs) have emerged as a promising alternative to traditional deep learning methods by explicitly modeling the relational structure of image regions through graph representations, which have already demonstrated promising results across various image-based tasks involving images. In this work, we explore two GNN architectures tailored for RSSC: BRMv2, a novel simplified graph model built on a base region adjacency graph (RAG), and modified hierarchical layered multigraph network (mHELMNet), a modified hierarchical multigraph model that encodes multiscale and spatial relationships through a multigraph representation. Both models were evaluated on the EUROSAT and RESISC45 datasets, achieving accuracy comparable to, or in some cases exceeding, that of state-of-the-art CNN-based, hybrid GNN-based, and transformer-based methods, while using significantly fewer parameters and without relying on pretraining. Experimental results demonstrated that the proposed GNN models, mHELMNet and BRMv2, achieved over 96% accuracy on EUROSAT and approximately 85% on RESISC45, while requiring only 0.14% and 0.03% of the parameters of the leading transformer-based approach, respectively.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4000,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Hierarchical Multiscale Representation in Remote Sensing Scene Classification\",\"authors\":\"João Pedro O. Batisteli;Silvio Jamil F. Guimarães;Zenilton K. G. Patrocínio\",\"doi\":\"10.1109/LGRS.2025.3587580\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Remote sensing scene classification (RSSC) poses significant challenges due to high spatial variability, complex textures, and semantic ambiguity in remote sensing imagery. While convolutional neural networks (CNNs) and transformer-based models have achieved notable success in this domain, their performance often depends on large-scale pretraining and substantial computational resources. Graph neural networks (GNNs) have emerged as a promising alternative to traditional deep learning methods by explicitly modeling the relational structure of image regions through graph representations, which have already demonstrated promising results across various image-based tasks involving images. In this work, we explore two GNN architectures tailored for RSSC: BRMv2, a novel simplified graph model built on a base region adjacency graph (RAG), and modified hierarchical layered multigraph network (mHELMNet), a modified hierarchical multigraph model that encodes multiscale and spatial relationships through a multigraph representation. 
Both models were evaluated on the EUROSAT and RESISC45 datasets, achieving accuracy comparable to, or in some cases exceeding, that of state-of-the-art CNN-based, hybrid GNN-based, and transformer-based methods, while using significantly fewer parameters and without relying on pretraining. Experimental results demonstrated that the proposed GNN models, mHELMNet and BRMv2, achieved over 96% accuracy on EUROSAT and approximately 85% on RESISC45, while requiring only 0.14% and 0.03% of the parameters of the leading transformer-based approach, respectively.\",\"PeriodicalId\":91017,\"journal\":{\"name\":\"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society\",\"volume\":\"22 \",\"pages\":\"1-5\"},\"PeriodicalIF\":4.4000,\"publicationDate\":\"2025-07-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11075693/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/11075693/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Hierarchical Multiscale Representation in Remote Sensing Scene Classification
Remote sensing scene classification (RSSC) poses significant challenges due to the high spatial variability, complex textures, and semantic ambiguity of remote sensing imagery. While convolutional neural networks (CNNs) and transformer-based models have achieved notable success in this domain, their performance often depends on large-scale pretraining and substantial computational resources. Graph neural networks (GNNs) have emerged as a promising alternative by explicitly modeling the relational structure of image regions through graph representations, an approach that has already shown promising results across a variety of image-based tasks. In this work, we explore two GNN architectures tailored for RSSC: BRMv2, a novel simplified graph model built on a base region adjacency graph (RAG), and the modified hierarchical layered multigraph network (mHELMNet), which encodes multiscale and spatial relationships through a hierarchical multigraph representation. Both models were evaluated on the EUROSAT and RESISC45 datasets, achieving accuracy comparable to, and in some cases exceeding, that of state-of-the-art CNN-based, hybrid GNN-based, and transformer-based methods, while using significantly fewer parameters and no pretraining. Experimental results show that mHELMNet and BRMv2 achieve over 96% accuracy on EUROSAT and approximately 85% on RESISC45, while requiring only 0.14% and 0.03%, respectively, of the parameters of the leading transformer-based approach.
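To make the region-graph idea concrete, the following is a minimal, illustrative sketch (not the authors' BRMv2 or mHELMNet code) of the general pipeline the abstract describes: segment an image into superpixels, build a region adjacency graph from them, and classify the graph with a small GNN. The choice of SLIC superpixels, mean-color node features, layer sizes, superpixel count, and the 10-class output are all assumptions made for illustration, using scikit-image and PyTorch Geometric.

import numpy as np
import torch
import torch.nn.functional as F
from skimage.segmentation import slic
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv, global_mean_pool


def image_to_rag(image: np.ndarray, n_segments: int = 64) -> Data:
    """Segment an H x W x 3 image into superpixels and return a PyG region graph."""
    labels = slic(image, n_segments=n_segments, start_label=0)
    n_regions = labels.max() + 1

    # Node features: mean color of each region (a deliberately simple choice).
    x = np.zeros((n_regions, 3), dtype=np.float32)
    for r in range(n_regions):
        x[r] = image[labels == r].mean(axis=0)

    # Edges: connect regions whose pixels are 4-adjacent in the image grid.
    edges = set()
    right = np.stack([labels[:, :-1].ravel(), labels[:, 1:].ravel()], axis=1)
    down = np.stack([labels[:-1, :].ravel(), labels[1:, :].ravel()], axis=1)
    for a, b in np.concatenate([right, down]):
        if a != b:
            edges.add((min(a, b), max(a, b)))
    edge_index = torch.tensor(sorted(edges), dtype=torch.long).t()
    edge_index = torch.cat([edge_index, edge_index.flip(0)], dim=1)  # make undirected

    return Data(x=torch.from_numpy(x), edge_index=edge_index)


class RagGCN(torch.nn.Module):
    """Two GCN layers, mean pooling over regions, and a linear scene classifier."""

    def __init__(self, num_classes: int = 10, hidden: int = 32):
        super().__init__()
        self.conv1 = GCNConv(3, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = torch.nn.Linear(hidden, num_classes)

    def forward(self, data: Data) -> torch.Tensor:
        h = F.relu(self.conv1(data.x, data.edge_index))
        h = F.relu(self.conv2(h, data.edge_index))
        batch = torch.zeros(h.size(0), dtype=torch.long)  # single-graph batch
        return self.head(global_mean_pool(h, batch))


if __name__ == "__main__":
    dummy = np.random.rand(64, 64, 3).astype(np.float32)  # stand-in for a scene patch
    graph = image_to_rag(dummy)
    logits = RagGCN()(graph)
    print(logits.shape)  # torch.Size([1, 10])

The paper's hierarchical multigraph model (mHELMNet) goes further by stacking several such region graphs at different scales and linking them, whereas this sketch builds a single flat RAG; it is only meant to show how a scene image becomes a graph that a GNN can classify.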