{"title":"利用光谱-空间轴向聚合变换器提升高光谱图像分类的泛化能力","authors":"Enzhe Zhao;Zhichang Guo;Shengzhu Shi;Yao Li;Jia Li;Dazhi Zhang","doi":"10.1109/TGRS.2024.3463187","DOIUrl":null,"url":null,"abstract":"In the hyperspectral image classification (HSIC) task, the most commonly used model validation paradigm is partitioning the training-test dataset through pixelwise random sampling. By training on a small amount of data, the deep learning model can achieve almost perfect accuracy. However, in our experiments, we found that the high accuracy was reached because the training and test datasets share a lot of information. On nonoverlapping dataset partitions, well-performing models suffer significant performance degradation. To this end, we propose a spectral-spatial axial aggregation transformer model, namely, SaaFormer, which preserves generalization across dataset partitions. SaaFormer applies a multilevel spectral extraction structure to segment the spectrum into multiple spectrum clips such that the wavelength continuity of the spectrum across the channel is preserved. For each spectrum clip, the axial aggregation attention mechanism, which integrates spatial features along multiple spectral axes, is applied to mine the spectral characteristic. The multilevel spectral extraction and the axial aggregation attention emphasize spectral characteristics to improve the model generalization. The experimental results on five publicly available datasets demonstrate that our model exhibits comparable performance on the random partition while significantly outperforming other methods on nonoverlapping partitions. Moreover, SaaFormer shows excellent performance on background classification.","PeriodicalId":13213,"journal":{"name":"IEEE Transactions on Geoscience and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":7.5000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Boosting the Generalization Ability for Hyperspectral Image Classification Using Spectral-Spatial Axial Aggregation Transformer\",\"authors\":\"Enzhe Zhao;Zhichang Guo;Shengzhu Shi;Yao Li;Jia Li;Dazhi Zhang\",\"doi\":\"10.1109/TGRS.2024.3463187\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the hyperspectral image classification (HSIC) task, the most commonly used model validation paradigm is partitioning the training-test dataset through pixelwise random sampling. By training on a small amount of data, the deep learning model can achieve almost perfect accuracy. However, in our experiments, we found that the high accuracy was reached because the training and test datasets share a lot of information. On nonoverlapping dataset partitions, well-performing models suffer significant performance degradation. To this end, we propose a spectral-spatial axial aggregation transformer model, namely, SaaFormer, which preserves generalization across dataset partitions. SaaFormer applies a multilevel spectral extraction structure to segment the spectrum into multiple spectrum clips such that the wavelength continuity of the spectrum across the channel is preserved. For each spectrum clip, the axial aggregation attention mechanism, which integrates spatial features along multiple spectral axes, is applied to mine the spectral characteristic. The multilevel spectral extraction and the axial aggregation attention emphasize spectral characteristics to improve the model generalization. 
The experimental results on five publicly available datasets demonstrate that our model exhibits comparable performance on the random partition while significantly outperforming other methods on nonoverlapping partitions. Moreover, SaaFormer shows excellent performance on background classification.\",\"PeriodicalId\":13213,\"journal\":{\"name\":\"IEEE Transactions on Geoscience and Remote Sensing\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Geoscience and Remote Sensing\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10683772/\",\"RegionNum\":1,\"RegionCategory\":\"地球科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Geoscience and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10683772/","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Boosting the Generalization Ability for Hyperspectral Image Classification Using Spectral-Spatial Axial Aggregation Transformer
In the hyperspectral image classification (HSIC) task, the most common model validation paradigm is to partition the training and test sets by pixelwise random sampling. Trained on a small amount of data, a deep learning model can reach nearly perfect accuracy. However, our experiments show that this high accuracy is reached because the training and test sets share a large amount of information. On nonoverlapping dataset partitions, models that perform well under random sampling suffer significant performance degradation. To this end, we propose a spectral-spatial axial aggregation transformer, SaaFormer, that preserves generalization across dataset partitions. SaaFormer applies a multilevel spectral extraction structure to segment the spectrum into multiple spectrum clips, so that the wavelength continuity of the spectrum across channels is preserved. For each spectrum clip, an axial aggregation attention mechanism, which integrates spatial features along multiple spectral axes, is applied to mine spectral characteristics. The multilevel spectral extraction and the axial aggregation attention emphasize spectral characteristics and thereby improve model generalization. Experimental results on five publicly available datasets demonstrate that our model achieves performance comparable to other methods on random partitions while significantly outperforming them on nonoverlapping partitions. Moreover, SaaFormer shows excellent performance on background classification.
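To make the two mechanisms named in the abstract more concrete, the following is a minimal PyTorch sketch of one plausible reading: the spectrum is split into contiguous clips (preserving wavelength order), and each clip is processed with self-attention along the spectral axis of every pixel. The module names, tensor shapes, and hyperparameters (e.g., `SpectralClipAxialAttention`, `clip_len=16`) are illustrative assumptions, not the authors' SaaFormer implementation.

```python
# Illustrative sketch only: (1) split the spectrum into contiguous clips so
# wavelength continuity is preserved, (2) attend along the spectral axis of
# each clip. Module names and hyperparameters are assumptions, not the
# authors' implementation.
import torch
import torch.nn as nn


class SpectralClipAxialAttention(nn.Module):
    """Self-attention along the spectral axis of one spectrum clip.

    Input:  (B, C_clip, H, W) -- one contiguous band clip of a hyperspectral patch.
    Output: same shape, with each pixel's bands re-weighted by attention
            computed along the spectral (channel) axis.
    """

    def __init__(self, embed_dim: int = 32, num_heads: int = 4):
        super().__init__()
        self.proj_in = nn.Linear(1, embed_dim)   # embed each scalar band value
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.proj_out = nn.Linear(embed_dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Treat every pixel as a sequence of C_clip band values (spectral axis).
        tokens = x.permute(0, 2, 3, 1).reshape(b * h * w, c, 1)  # (B*H*W, C_clip, 1)
        tokens = self.proj_in(tokens)                            # (B*H*W, C_clip, D)
        attended, _ = self.attn(tokens, tokens, tokens)          # attention along bands
        out = self.proj_out(attended).reshape(b, h, w, c).permute(0, 3, 1, 2)
        return out + x                                           # residual connection


class MultilevelSpectralClips(nn.Module):
    """Segment the spectrum into contiguous clips, process each clip with
    spectral-axis attention, and concatenate the clips back together."""

    def __init__(self, num_bands: int, clip_len: int = 16):
        super().__init__()
        self.clip_len = clip_len
        n_clips = (num_bands + clip_len - 1) // clip_len
        self.clip_blocks = nn.ModuleList(
            [SpectralClipAxialAttention() for _ in range(n_clips)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # torch.split keeps bands in wavelength order within each clip.
        clips = torch.split(x, self.clip_len, dim=1)
        processed = [blk(clip) for blk, clip in zip(self.clip_blocks, clips)]
        return torch.cat(processed, dim=1)


if __name__ == "__main__":
    cube = torch.randn(2, 64, 9, 9)  # (batch, bands, height, width) patch
    model = MultilevelSpectralClips(num_bands=64, clip_len=16)
    print(model(cube).shape)         # torch.Size([2, 64, 9, 9])
```

A full model in the spirit of the abstract would additionally aggregate spatial features along multiple axes and stack such blocks at several levels before a classification head; the sketch above only illustrates the clip-and-attend idea.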
Journal Description:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.