Jingfeng Zhang, Bin Zhou, Jin Lu, Ben Wang, Zhipeng Ding, Songyue He
Vegetation extraction from Landsat8 operational land imager remote sensing imagery based on Attention U-Net and vegetation spectral features
Journal of Applied Remote Sensing. Published 2024-05-01. DOI: 10.1117/1.jrs.18.032403 (https://doi.org/10.1117/1.jrs.18.032403)
Citations: 0
Abstract
Rapid, accurate, and intelligent extraction of vegetation areas is important for research on forest resource inventory, climate change, and the greenhouse effect. Existing semantic segmentation models are limited by insufficient extraction accuracy (ACC) and by the imbalance between positive and negative classes in datasets. We therefore propose an Attention U-Net model for vegetation extraction from Landsat8 Operational Land Imager (OLI) remote sensing images. By combining the convolutional block attention module (CBAM), a Visual Geometry Group 16 (VGG16) backbone network, and Dice loss, the model reduces the omission and misclassification of fragmented vegetation areas and mitigates the imbalance between positive and negative classes. To test how different band combinations affect the ACC of vegetation extraction, we introduce near-infrared (NIR) and short-wave infrared (SWIR) spectral information and form three datasets: the 432 dataset (R, G, B), the 543 dataset (NIR, R, G), and the 654 dataset (SWIR, NIR, R). To validate the proposed model, we compare it with three classic semantic segmentation models: PSP-Net, DeepLabv3+, and U-Net. Experimental results show that all models extract vegetation more accurately on the false color datasets than on the true color dataset, with the best performance on the 654 dataset. The proposed Attention U-Net achieves the highest overall performance, with mean intersection over union, mean pixel ACC, and ACC reaching 0.877, 0.940, and 0.946, respectively, which supports the effectiveness of the proposed model. The model also generalizes and transfers well when tested in other regions, indicating its potential for wider application.
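The quantities named in the abstract (the three band combinations, Dice loss, and the mean intersection over union, mean pixel ACC, and ACC metrics) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the smoothing constant, and the toy labels are assumptions made for illustration, and the metric formulas are the standard confusion-matrix definitions, which are assumed to match those used in the paper.

```python
# Illustrative sketch only; names and toy data are hypothetical, not from the paper.

# Landsat 8 OLI band numbers for the three composites described in the abstract.
BAND_COMBINATIONS = {
    "432": (4, 3, 2),  # R, G, B (true color)
    "543": (5, 4, 3),  # NIR, R, G (false color)
    "654": (6, 5, 4),  # SWIR, NIR, R (false color)
}


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss over flat lists of predicted probabilities and 0/1 labels.

    The smoothing term is a common convention and an assumption here.
    """
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def confusion_matrix(preds, targets, num_classes=2):
    """Return matrix[true_class][predicted_class] of pixel counts."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, targets):
        matrix[t][p] += 1
    return matrix


def segmentation_metrics(matrix):
    """Compute mean IoU, mean pixel accuracy, and overall accuracy."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    ious, class_accs = [], []
    for c in range(n):
        tp = matrix[c][c]
        actual = sum(matrix[c])  # pixels whose true class is c
        predicted = sum(matrix[r][c] for r in range(n))  # pixels predicted as c
        union = actual + predicted - tp
        ious.append(tp / union if union else 0.0)
        class_accs.append(tp / actual if actual else 0.0)
    return {
        "miou": sum(ious) / n,
        "mpa": sum(class_accs) / n,
        "acc": sum(matrix[c][c] for c in range(n)) / total,
    }


if __name__ == "__main__":
    # Toy example: 1 = vegetation, 0 = background.
    targets = [1, 1, 1, 0, 0, 0, 0, 0]
    preds = [1, 1, 0, 0, 0, 0, 0, 1]
    print(segmentation_metrics(confusion_matrix(preds, targets)))
    print(dice_loss([float(p) for p in preds], targets))
```

Because Dice loss is computed from the overlap between predicted and true vegetation pixels rather than from a per-pixel average, a large background class does not dominate it, which is the usual motivation for using it on imbalanced segmentation data.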
About the journal:
The Journal of Applied Remote Sensing is a peer-reviewed journal that optimizes the communication of concepts, information, and progress among the remote sensing community.