Semantic segmentation of multi-scale remote sensing images with contextual feature enhancement
Mei Zhang, Lingling Liu, Yongtao Pei, Guojing Xie, Jinghua Wen
The Visual Computer, published 2024-08-27
DOI: https://doi.org/10.1007/s00371-024-03419-x
Citations: 0
Abstract
Remote sensing images have complex characteristics: irregularly shaped multi-scale features, large scale variations, and imbalanced sizes across categories. These characteristics reduce the accuracy of semantic segmentation. To address this problem, this paper presents a context feature-enhanced method for multi-scale semantic segmentation of remote sensing images. A context aggregation module performs global context co-aggregation, obtaining feature representations at different levels through self-similarity calculation and convolution operations. The processed features are passed to a feature enhancement module, which introduces a channel gate mechanism that strengthens the expressive power of the feature maps by exploiting channel correlations and weighted fusion. Pyramid pooling is then applied to the enhanced features to capture multi-scale information, improving the performance and accuracy of the segmentation model. Experimental results on the Vaihingen and Potsdam datasets (publicly available at https://www.isprs.org/education/benchmarks/UrbanSemLab/Default.aspx) show that the proposed method (whose source code is given in Sect. 3.4) significantly improves performance and accuracy over previous multi-scale remote sensing semantic segmentation approaches, verifying its effectiveness.
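The three stages named in the abstract (self-similarity context aggregation, channel gating, pyramid pooling) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the modules, so the sketch substitutes well-known generic forms (non-local style attention, a squeeze-and-excitation style gate, and PSPNet style average pooling). All function names, weight shapes, and bin sizes are hypothetical, and the convolution layers are omitted.

```python
import numpy as np


def self_similarity_context(x):
    """Assumed non-local form: add to each position a similarity-weighted
    sum of the features at all positions. x has shape (C, H, W)."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                    # (C, N)
    sim = flat.T @ flat / np.sqrt(c)              # (N, N) pairwise similarity
    sim = sim - sim.max(axis=1, keepdims=True)    # stabilise the softmax
    attn = np.exp(sim)
    attn = attn / attn.sum(axis=1, keepdims=True)  # each row sums to 1
    context = flat @ attn.T                       # (C, N) aggregated context
    return x + context.reshape(c, h, w)           # residual connection


def channel_gate(x, w1, w2):
    """Assumed squeeze-and-excitation form: rescale each channel by a
    gate in (0, 1) computed from the global channel averages."""
    squeezed = x.mean(axis=(1, 2))                # (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)       # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid, shape (C,)
    return x * gate[:, None, None]


def pyramid_pool(x, bins=(1, 2, 4)):
    """Average-pool x into b x b grids for each b in bins, upsample by
    nearest neighbour, and concatenate with x along the channel axis.
    Assumes H and W are divisible by every bin size."""
    c, h, w = x.shape
    levels = [x]
    for b in bins:
        pooled = x.reshape(c, b, h // b, b, w // b).mean(axis=(2, 4))
        up = np.repeat(np.repeat(pooled, h // b, axis=1), w // b, axis=2)
        levels.append(up)
    return np.concatenate(levels, axis=0)


rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))             # toy feature map (C, H, W)
w1 = rng.standard_normal((2, 8)) * 0.1            # hypothetical gate weights
w2 = rng.standard_normal((8, 2)) * 0.1

ctx = self_similarity_context(feat)
gated = channel_gate(ctx, w1, w2)
out = pyramid_pool(gated)
print(out.shape)  # (32, 4, 4): 8 input channels plus 8 per pyramid level
```

In a trained network the gate weights would be learned and the pooled levels would typically pass through convolutions before fusion; the sketch only shows how the data flows between the three stages.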