Yuqing Li, Yansong Song, Keyan Dong, Gong Zhang, Yun Fu, Gangqi Yan, Yanbo Wang, Lei Zhang, Tianci Liu
DSaC-ViT: Multi-scale guided upsampling fusion and parallel fusion vision transformer for hyperspectral image classification
Computers & Electrical Engineering, vol. 132, Article 111021. Published 2026-04-01 (Epub 2026-02-07). Journal Article.
DOI: 10.1016/j.compeleceng.2026.111021
https://www.sciencedirect.com/science/article/pii/S0045790626000935
Journal impact factor: 4.9. JCR: Q1 (Computer Science, Hardware & Architecture).
Citations: 0
Abstract
Hyperspectral images (HSI) capture rich spectral information for accurate land-cover classification. Recently, models based on hybrid architectures of convolutional neural networks (CNNs) and Transformers have been widely used for hyperspectral classification. However, fully integrating the local features from CNNs with the global features from Transformers remains a significant challenge. To alleviate this problem, we propose an upsampling dual-scale fusion and self-attention convolutional parallel fusion vision Transformer (DSaC-ViT), which consists of a parallel self-attention convolutional vision Transformer (PSCViT) and a plug-and-play multi-scale guided upsampling feature fusion module (MGUFFM). PSCViT integrates the convolution and self-attention modules in parallel. Interaction between different patches via a global token yields a global information representation. Adaptive parameters are then used to fuse this representation with the local information extracted by the CNN, thereby achieving granularity alignment. In this way, PSCViT extracts and fuses local and global features. MGUFFM extracts spatial-spectral guidance features via a dual-branch structure to guide the upsampling fusion of high-level feature maps, which recovers missing spatial and spectral information. Extensive experiments were conducted on four representative HSI datasets covering agricultural, forest, urban, and wetland scenes. The results indicate that the proposed model outperforms the other HSI classification methods it was compared against.
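The parallel fusion idea in the abstract (a global self-attention branch and a local convolution branch combined through adaptive parameters) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract gives no layer details, so the identity query/key/value projections, the fixed 1-D smoothing kernel, the use of plain self-attention in place of the paper's global-token mechanism, and the names `global_branch`, `local_branch`, `parallel_fusion`, `alpha`, and `beta` are all assumptions made for illustration. In a trained model `alpha` and `beta` would be learned rather than passed in by hand.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def global_branch(tokens):
    """Single-head self-attention with identity Q/K/V projections (assumption)."""
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        # Scaled dot-product score of this token against every token.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out


def local_branch(tokens, kernel=(0.25, 0.5, 0.25)):
    """1-D convolution over neighbouring tokens, zero-padded at the edges.

    The fixed kernel is a hypothetical stand-in for learned CNN filters.
    """
    n, dim = len(tokens), len(tokens[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * dim
        for offset, w in enumerate(kernel):
            j = i + offset - half
            if 0 <= j < n:
                for d in range(dim):
                    row[d] += w * tokens[j][d]
        out.append(row)
    return out


def parallel_fusion(tokens, alpha, beta):
    """Run both branches on the same input and blend them with scalar weights."""
    g = global_branch(tokens)
    loc = local_branch(tokens)
    return [[alpha * gv + beta * lv for gv, lv in zip(g_row, l_row)]
            for g_row, l_row in zip(g, loc)]


if __name__ == "__main__":
    patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy patch tokens
    print(parallel_fusion(patches, alpha=0.6, beta=0.4))
```

Both branches receive the same tokens and produce outputs of the same shape, so the weighted sum is well defined. Setting `alpha=1, beta=0` reduces the output to the global branch alone, and `alpha=0, beta=1` to the local branch alone.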
About the journal:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.