Shujie Ding;Xiaoli Ruan;Jing Yang;Chengjiang Li;Jie Sun;Xianghong Tang;Zhidong Su
Title: LRDTN: Spectral–Spatial Convolutional Fusion Long-Range Dependence Transformer Network for Hyperspectral Image Classification
Journal: IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-21
Published: 2024-12-02
DOI: 10.1109/TGRS.2024.3510625
URL: https://ieeexplore.ieee.org/document/10772646/
Impact factor: 7.5 (JCR Q1, Engineering, Electrical & Electronic)
Deep learning has recently achieved remarkable breakthroughs in hyperspectral image (HSI) classification, particularly with methods based on convolutional neural networks (CNNs) and transformers. However, these methods have several limitations: 1) the limited receptive field inherent in convolutional layers greatly hampers the capture of large-scale contextual information, and 2) transformers cannot establish strong local relationships, which makes it challenging to characterize complex dependencies between distant pixels and different bands in HSIs. Moreover, the number of network parameters grows as network complexity increases. To address these challenges, we propose a novel network for HSI classification called the spectral-spatial convolutional fusion long-range dependence transformer network (LRDTN). LRDTN comprises three key components: the dynamic-dependent convolutional (DDC) module, the multiscale enhanced fusion (MsEF) module, and the local–global perception transformer (LGPT). Specifically, the DDC dynamically models local features, while the MsEF integrates information from different scales to capture contextual relationships in HSI features effectively. In addition, the proposed transformer variant, LGPT, enhances the ability to mine and utilize local-global HSI features and complex long-range dependencies. Through this design, LRDTN maintains its performance while reducing the number of network parameters. Extensive experiments on four typical HSI datasets, covering urban areas, agricultural areas, and swamps, demonstrate the superiority of LRDTN over other state-of-the-art networks. The code is available at https://github.com/ybyangjing/LRDTN.
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.