Hui Yang, Zhipeng Jiang, Yaobo Zhang, Yanlan Wu, Heng Luo, Peng Zhang, Biao Wang
Title: A high-resolution remote sensing land use/land cover classification method based on multi-level features adaptation of segment anything model
DOI: 10.1016/j.jag.2025.104659
Journal: International Journal of Applied Earth Observation and Geoinformation: ITC Journal, volume 141, Article 104659 (JCR Q1, Remote Sensing; impact factor 7.6)
Published: 2025-06-13
URL: https://www.sciencedirect.com/science/article/pii/S1569843225003061
Citations: 0
Abstract
Land use/land cover (LULC) classification based on deep learning is an important research area in the analysis of high-resolution remote sensing (HRRS) images. However, because available samples are limited and model feature extraction capability is constrained, current deep learning methods generalize poorly, which hinders their widespread and effective application across diverse HRRS scenarios. To address this problem, we propose a network model named multi-level feature adaptation-segment anything model (MLFA-SAM). The model employs a three-level fine-tuning strategy to adapt the SAM foundation model to remote sensing LULC classification and enhances high-precision classification performance across diverse HRRS scenarios. First, the domain distribution shift adaptation (DDSA) level adjusts the input image modality for SAM, performs initial feature extraction, and overcomes the domain distribution shift between remote sensing images and the natural images used to train SAM. Second, we design a depthwise low-rank adaptation (DLRA) strategy to fine-tune the frozen SAM parameters. Third, we improve SAM’s mask decoder to generate the high-quality multi-class masks required for LULC classification. Experimental results show that MLFA-SAM surpasses several existing state-of-the-art (SOTA) methods on the HRLC dataset and the ISPRS Potsdam dataset. With its concise yet efficient architecture, MLFA-SAM achieves 66.77% mean intersection over union (mIoU) and 86.02% overall accuracy (OA) on the HRLC dataset. Notably, integrating near-infrared (NIR) bands raises its performance to 68.43% mIoU and 87.91% OA. The generalization test on the LoveDA dataset, along with four test HRRS images exhibiting spatiotemporal and semantic scene differences, further demonstrates that MLFA-SAM generalizes better than existing methods and shows greater potential for practical applications.
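For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how mIoU and OA are conventionally computed from a confusion matrix. It uses the standard textbook definitions, not the authors' evaluation code. The abstract does not state details such as ignored classes or label conventions, so the function names, the toy labels, and the choice to skip classes absent from both reference and prediction are all assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the reference class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm: List[List[int]]) -> float:
    """OA: correctly classified pixels divided by all pixels."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def mean_iou(cm: List[List[int]]) -> float:
    """mIoU: per-class TP / (TP + FP + FN), averaged over classes."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # reference c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, reference other
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three hypothetical classes and six flattened pixels.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(f"OA = {overall_accuracy(cm):.4f}, mIoU = {mean_iou(cm):.4f}")
```

Because mIoU averages over classes while OA averages over pixels, mIoU is typically the lower of the two when some classes are rare or hard to separate, which is consistent with the gap between the two figures quoted above.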
About the journal:
The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.