ATSFCNN: A Novel Attention-based Triple-Stream Fused CNN Model for Hyperspectral Image Classification
Jizhen Cai, Clotilde Boust, Alamin Mansouri
Machine Learning: Science and Technology, published 2024-01-10
DOI: https://doi.org/10.1088/2632-2153/ad1d05
Convolutional neural networks (CNNs) have become increasingly important in hyperspectral image classification because of their strong performance. However, most previous research has focused on 2D-CNNs. 3D-CNNs have seen limited use because of their complexity, despite their potential to improve information extraction between adjacent channels of the image, and 1D-CNNs are typically restricted to signal processing because they ignore the spatial information in hyperspectral images. In this paper, we propose a novel CNN model, ATSFCNN (Attention-based Triple-Stream Fused Convolutional Neural Network), that fuses the features of a 1D-CNN, a 2D-CNN, and a 3D-CNN to take all the relevant information in the hyperspectral dataset into account. Our contributions are twofold. First, we propose a strategy to extract and homogenize features from the 1D, 2D, and 3D CNNs. Second, we propose a way to fuse these features efficiently. This attention-based approach integrates features from the three streams and thereby overcomes the limitations of using a single stream, yielding higher accuracy and stability in hyperspectral classification. We compared ATSFCNN with other deep learning models, including 1D-CNN, 2D-CNN, 2D-CNN+PCA, 3D-CNN, and 3D-CNN+PCA, and it showed superior performance and robustness. Quantitative assessments based on Overall Accuracy (OA), Average Accuracy (AA), and the Kappa Coefficient (κ) support this result. Across three remote sensing datasets, ATSFCNN consistently achieves the highest OA, at 98.38%, 97.09%, and 96.93%, respectively, with corresponding AA scores of 98.47%, 95.80%, and 95.80% and κ values of 97.41%, 96.14%, and 95.21%.
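For readers unfamiliar with the three metrics quoted above, the following is a minimal sketch of how OA, AA, and κ are conventionally computed from a confusion matrix. It uses the standard textbook definitions and is not taken from the paper's code; the function names and the toy labels are hypothetical, and the abstract appears to report κ multiplied by 100.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Rows index the true class, columns the predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm: List[List[int]]) -> float:
    """OA: fraction of all samples that were classified correctly."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def average_accuracy(cm: List[List[int]]) -> float:
    """AA: mean of the per-class accuracies (per-class recall).

    Classes with no true samples are skipped to avoid dividing by zero.
    """
    recalls = [cm[i][i] / sum(cm[i]) for i in range(len(cm)) if sum(cm[i]) > 0]
    return sum(recalls) / len(recalls)


def kappa_coefficient(cm: List[List[int]]) -> float:
    """Cohen's kappa: agreement corrected for chance agreement.

    Undefined (division by zero) when the expected agreement equals 1.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = overall_accuracy(cm)
    # Expected agreement from the row (true) and column (predicted) marginals.
    p_expected = sum(
        sum(cm[i]) * sum(cm[j][i] for j in range(n)) for i in range(n)
    ) / (total * total)
    return (p_observed - p_expected) / (1 - p_expected)


# Toy labels for three classes (illustrative only, not data from the paper).
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
cm = confusion_matrix(y_true, y_pred, n_classes=3)

print(f"OA    = {overall_accuracy(cm):.4f}")
print(f"AA    = {average_accuracy(cm):.4f}")
print(f"kappa = {kappa_coefficient(cm):.4f}")
```

OA weights every sample equally, so it is dominated by large classes, whereas AA weights every class equally. Reporting both, along with κ, gives a fuller picture on class-imbalanced hyperspectral datasets.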