Research on hyperspectral remote sensing alteration mineral mapping using an improved ViT model
Xinhang Feng, Jiejun Huang, Ximing Chen, Han Zhou, Ming Zhang, Chuan Zhang, Fawang Ye
Computers & Geosciences, Volume 206, Article 106037 (Journal Article; impact factor 4.4; JCR Q1, Computer Science, Interdisciplinary Applications)
Published: 2025-08-09
DOI: 10.1016/j.cageo.2025.106037
URL: https://www.sciencedirect.com/science/article/pii/S0098300425001876
Citations: 0
Abstract
The distribution of altered minerals is a key indicator for finding strategic minerals such as uranium, cobalt, nickel, copper, and zinc. In recent years, deep learning has shown clear advantages in hyperspectral altered mineral mapping. However, constructing a large volume of high-quality training samples remains time-consuming and labor-intensive. Moreover, many models generalize poorly: they perform well on training data but degrade significantly on test datasets or in real-world applications. To reduce the labeling effort, a semi-automatic sample construction method is proposed. It involves three steps: extracting mineral abundance by mixed pixel decomposition, screening samples via mixed matching, and enhancing classification accuracy with spectral characteristic quantification. Experimental results show that the dataset generated by the semi-automated method yields a test accuracy of 92.81% on the ViT model, close to the 93.29% obtained with manually labeled samples. On the model side, an improved Vision Transformer (ViT) is proposed. This SpecPool-Transformer (SPT) model integrates the Grouped Spectral Embedding Module (GSE) and the Convolution-Pooling Module (CPM) to enhance the extraction of adjacent-band features from the spectral curves. In addition, transfer learning is used to apply the model to cross-source data. On the SASI dataset of the Baiyanghe uranium deposit, the overall accuracy (OA) and average accuracy (AA) of SpecPool-Transformer reached 96.76% and 95.14%, respectively, improvements of 3.95% and 6.11% over the original ViT model. In the generalization test, the proposed method achieved an OA of 86.10% and an AA of 83.74% on the SASI aerial dataset No.1007, outperforming the second-best model, LightGBM, by 20.22% and 31.15%, respectively. Field validation results further confirm the reliability of the proposed model in large-scale alteration mineral mapping across data sources, making it suitable for rapid and extensive alteration mineral mapping applications.
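The abstract reports both overall accuracy (OA) and average accuracy (AA) without defining them. The sketch below assumes the definitions that are standard in hyperspectral classification: OA is the fraction of all samples classified correctly, and AA is the unweighted mean of the per-class accuracies. The function names and the toy labels are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Count samples per (true class, predicted class) pair.

    Rows index the true class, columns the predicted class.
    """
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def overall_accuracy(matrix: List[List[int]]) -> float:
    """OA: correctly classified samples divided by all samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def average_accuracy(matrix: List[List[int]]) -> float:
    """AA: mean of per-class accuracies, skipping classes with no samples."""
    per_class = [row[i] / sum(row) for i, row in enumerate(matrix) if sum(row) > 0]
    return sum(per_class) / len(per_class)


# Toy, made-up labels: class 0 is common, class 1 is rare.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=2)
oa = overall_accuracy(cm)   # 9 of 10 samples correct
aa = average_accuracy(cm)   # mean of 8/8 and 1/2
print(f"OA = {oa:.2f}, AA = {aa:.2f}")
```

In this toy case OA is 0.90 while AA is 0.75, because the rare class is classified correctly only half the time. This is why a gap between OA and AA, such as the one reported for the generalization test, is often read as a sign of uneven per-class performance.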
About the journal:
Computers & Geosciences publishes high impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.