DSaC-ViT: Multi-scale guided upsampling fusion and parallel fusion vision transformer for hyperspectral image classification

IF 4.9; Zone 3 (Computer Science); JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Yuqing Li, Yansong Song, Keyan Dong, Gong Zhang, Yun Fu, Gangqi Yan, Yanbo Wang, Lei Zhang, Tianci Liu
Journal: Computers & Electrical Engineering, Volume 132, Article 111021
DOI: 10.1016/j.compeleceng.2026.111021
Publication date: 2026-04-01 (Epub 2026-02-07)
Publication type: Journal Article
Open access: No
Full text: https://www.sciencedirect.com/science/article/pii/S0045790626000935
Citations: 0

Abstract

Hyperspectral images (HSIs) capture rich spectral information that supports accurate land-cover classification. Hybrid architectures that combine convolutional neural networks (CNNs) and Transformers have recently been widely used for HSI classification. However, fully integrating the local features extracted by CNNs with the global features extracted by Transformers remains a significant challenge. To alleviate this problem, we propose an upsampling dual-scale fusion and self-attention convolutional parallel fusion vision Transformer (DSaC-ViT), which consists of a parallel self-attention convolutional vision Transformer (PSCViT) and a plug-and-play multi-scale guided upsampling feature fusion module (MGUFFM). PSCViT runs its convolution and self-attention modules in parallel. A global token mediates the interaction between different patches and yields a global information representation. Adaptive parameters then fuse this representation with the local information extracted by the CNN, thereby achieving granularity alignment. In this way, PSCViT extracts and fuses local and global features effectively. MGUFFM extracts spatial-spectral guidance features through a dual-branch structure and uses them to guide the upsampling fusion of high-level feature maps, which recovers missing spatial and spectral information. We conducted extensive experiments on four representative HSI datasets covering agricultural, forest, urban, and wetland scenes. The results indicate that the proposed model outperforms the compared methods on HSI classification.
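The parallel design described in the abstract (a convolution branch and a self-attention branch whose outputs are fused with adaptive parameters) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the layers, the global-token mechanism, or the form of the adaptive parameters. The sketch assumes a fixed 1-D smoothing kernel as a stand-in for the CNN branch, plain single-head self-attention without learned projections as a stand-in for the global branch, and a single scalar gate `gate_logit` (a hypothetical name) passed through a sigmoid as the adaptive fusion weight.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Global branch (assumed): single-head attention with identity Q/K/V."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Scaled dot-product score of this token against every token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        # Each output token is a convex combination of all input tokens.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens))
                    for j in range(d)])
    return out


def local_conv(tokens, kernel=(0.25, 0.5, 0.25)):
    """Local branch (assumed): zero-padded 1-D convolution over neighbours."""
    n, d = len(tokens), len(tokens[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * d
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:  # positions outside the sequence count as zero
                for c in range(d):
                    row[c] += w * tokens[j][c]
        out.append(row)
    return out


def fuse(tokens, gate_logit=0.0):
    """Run both branches on the same input and blend them with one gate."""
    alpha = 1.0 / (1.0 + math.exp(-gate_logit))  # sigmoid keeps alpha in (0, 1)
    local = local_conv(tokens)
    global_ = self_attention(tokens)
    return [[alpha * l + (1.0 - alpha) * g for l, g in zip(lr, gr)]
            for lr, gr in zip(local, global_)]


if __name__ == "__main__":
    patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 tokens, 2 channels
    print(fuse(patches, gate_logit=0.0))  # equal weight on both branches
```

In a trained model the gate would be a learned parameter (possibly one per channel or per layer) rather than a fixed argument; a scalar is used here only to show how a single weight shifts the output between local and global features.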

Source journal
Computers & Electrical Engineering
Category: Engineering Technology - Engineering: Electrical & Electronic
CiteScore: 9.20
Self-citation rate: 7.00%
Articles published: 661
Review time: 47 days
Journal description: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.