ATSFCNN: A Novel Attention-based Triple-Stream Fused CNN Model for Hyperspectral Image Classification

Jizhen Cai, Clotilde Boust, Alamin Mansouri
{"title":"ATSFCNN: A Novel Attention-based Triple-Stream Fused CNN Model for Hyperspectral Image Classification","authors":"Jizhen Cai, Clotilde Boust, Alamin Mansouri","doi":"10.1088/2632-2153/ad1d05","DOIUrl":null,"url":null,"abstract":"\n Recently, the Convolutional Neural Network (CNN) has gained increasing importance in hyperspectral image classification thanks to its superior performance. However, most of the previous research has mainly focused on 2D-CNN, and the limited applications of 3D-CNN have been attributed to its complexity, despite its potential to enhance information extraction between adjacent channels of the image. Moreover, 1D-CNN is typically restricted to the field of signal processing as it ignores the spatial information of hyperspectral images. In this paper, we propose a novel CNN model named ATSFCNN (Attention-based Triple-Stream Fused Convolutional Neural Network) that fuses the features of 1D-CNN, 2D-CNN, and 3D-CNN to consider all the relevant information of the hyperspectral dataset. Our contributions are twofold: First, we propose a strategy to extract and homogenize features from 1D, 2D, and 3D CNN. Secondly, we propose a way to efficiently fuse these features. This attention-based methodology adeptly integrates features from the triple streams, thereby transcending the former limitations of singular stream utilization. Consequently, it becomes capable of attaining elevated outcomes in the context of hyperspectral classification, marked by increased levels of both accuracy and stability. We compared the results of ATSFCNN with those of other deep learning models, including 1D-CNN, 2D-CNN, 2D-CNN+PCA, 3D-CNN, and 3D-CNN+PCA, and demonstrated its superior performance and robustness. Quantitative assessments, predicated on the metrics of Overall Accuracy (OA), Average Accuracy (AA), and Kappa Coefficient (κ) emphatically corroborate the preeminence of ATSFCNN. Notably, spanning three remote sensing datasets, ATSFCNN consistently achieves peak levels of Overall Accuracy, quantified at 98.38%, 97.09%, and 96.93% respectively. This prowess is further accentuated by concomitant Average Accuracy scores of 98.47%, 95.80%, and 95.80%, as well as Kappa Coefficient values amounting to 97.41%, 96.14%, and 95.21%.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"83 14","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning: Science and Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1088/2632-2153/ad1d05","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Recently, the Convolutional Neural Network (CNN) has gained increasing importance in hyperspectral image classification thanks to its superior performance. However, most previous research has focused on the 2D-CNN, while the 3D-CNN, despite its potential to better exploit the information shared between adjacent spectral channels, has seen limited use because of its complexity. The 1D-CNN, in turn, is typically restricted to signal processing, as it ignores the spatial information of hyperspectral images. In this paper, we propose a novel CNN model named ATSFCNN (Attention-based Triple-Stream Fused Convolutional Neural Network) that fuses the features of the 1D-CNN, 2D-CNN, and 3D-CNN to take all the relevant information of a hyperspectral dataset into account. Our contributions are twofold. First, we propose a strategy for extracting and homogenizing features from the 1D, 2D, and 3D CNN streams. Second, we propose an efficient, attention-based way to fuse these features. By integrating the features of all three streams, the model overcomes the limitations of any single stream and achieves higher accuracy and stability in hyperspectral classification. We compared ATSFCNN with other deep learning models, including the 1D-CNN, 2D-CNN, 2D-CNN+PCA, 3D-CNN, and 3D-CNN+PCA, and demonstrated its superior performance and robustness. Quantitative assessment on the metrics of Overall Accuracy (OA), Average Accuracy (AA), and Kappa Coefficient (κ) confirms the advantage of ATSFCNN: across three remote sensing datasets it achieves the highest Overall Accuracy, at 98.38%, 97.09%, and 96.93% respectively, with corresponding Average Accuracy scores of 98.47%, 95.80%, and 95.80% and Kappa Coefficient values of 97.41%, 96.14%, and 95.21%.
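The abstract describes the architecture only at a high level. Below is a minimal PyTorch sketch of the general idea: three CNN streams (1D over the centre-pixel spectrum, 2D over the spatial patch, 3D over the full spectral-spatial cube) whose features are homogenized to a common dimension and fused with softmax attention weights before classification. All layer widths, kernel sizes, the attention formulation, and the patch geometry are illustrative assumptions, not the paper's published configuration.

```python
# Illustrative triple-stream fused CNN; layer sizes and the attention
# MLP are assumptions, not the exact ATSFCNN configuration.
import torch
import torch.nn as nn


class TripleStreamFusedCNN(nn.Module):
    def __init__(self, n_bands: int, n_classes: int, feat_dim: int = 128):
        super().__init__()
        # Stream 1: 1D-CNN over the spectral vector of the centre pixel.
        self.stream1d = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, feat_dim),
        )
        # Stream 2: 2D-CNN over the spatial patch, bands as input channels.
        self.stream2d = nn.Sequential(
            nn.Conv2d(n_bands, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim),
        )
        # Stream 3: 3D-CNN over the full spectral-spatial cube.
        self.stream3d = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3), padding=(3, 1, 1)), nn.ReLU(),
            nn.Conv3d(8, 16, kernel_size=(5, 3, 3), padding=(2, 1, 1)), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, feat_dim),
        )
        # A small MLP scores each homogenized feature vector; softmax
        # over the three scores yields the fusion weights.
        self.attn = nn.Sequential(nn.Linear(feat_dim, 64), nn.Tanh(),
                                  nn.Linear(64, 1))
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, cube: torch.Tensor) -> torch.Tensor:
        # cube: (B, bands, H, W), a hyperspectral patch around each pixel.
        b, c, h, w = cube.shape
        spectrum = cube[:, :, h // 2, w // 2].unsqueeze(1)  # (B, 1, bands)
        f1 = self.stream1d(spectrum)
        f2 = self.stream2d(cube)
        f3 = self.stream3d(cube.unsqueeze(1))               # (B, 1, bands, H, W)
        feats = torch.stack([f1, f2, f3], dim=1)            # (B, 3, feat_dim)
        weights = torch.softmax(self.attn(feats), dim=1)    # (B, 3, 1)
        fused = (weights * feats).sum(dim=1)                # (B, feat_dim)
        return self.classifier(fused)


# e.g. a 9x9 patch from a 200-band image such as Indian Pines
model = TripleStreamFusedCNN(n_bands=200, n_classes=16)
logits = model(torch.randn(4, 200, 9, 9))
print(logits.shape)  # torch.Size([4, 16])
```

Training such a model would use a standard cross-entropy loss; PCA band reduction, as in the compared 2D-CNN+PCA and 3D-CNN+PCA baselines, could optionally be applied before patch extraction.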
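For reference, the three reported metrics are standard quantities computed from the confusion matrix. A minimal NumPy sketch of their definitions follows; the toy matrix is fabricated for illustration, and the paper apparently quotes κ scaled to a percentage.

```python
# Standard definitions of OA, AA, and the Kappa Coefficient from a
# confusion matrix; the example matrix is made up, not from the paper.
import numpy as np


def oa_aa_kappa(conf: np.ndarray):
    """conf[i, j] = number of samples of true class i predicted as class j."""
    n = conf.sum()
    oa = np.trace(conf) / n                                   # Overall Accuracy
    aa = np.mean(np.diag(conf) / conf.sum(axis=1))            # Average Accuracy
    pe = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / n**2   # chance agreement
    kappa = (oa - pe) / (1 - pe)                              # Kappa Coefficient
    return oa, aa, kappa


conf = np.array([[50, 2, 0],
                 [3, 45, 2],
                 [1, 1, 48]])
print(oa_aa_kappa(conf))  # approx (0.941, 0.941, 0.911)
```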