Enhancing land use and land cover classification with deep learning-based satellite imagery segmentation
Authors: Tsion Fekadu Deressu, Amanuel Kumsa Bojer, Taye Girma Debelee, Worku Gachena Negera, Saralees Nadarajah, Kena Wendimu Gebissa
Journal: International journal of applied earth observation and geoinformation : ITC journal, volume 144, Article 104839
Publication date: 2025-09-22
Publication type: Journal Article
DOI: 10.1016/j.jag.2025.104839
URL: https://www.sciencedirect.com/science/article/pii/S1569843225004868
Impact factor: 8.6 (JCR Q1, Remote Sensing)
Citations: 0
Abstract
Semantic segmentation of satellite imagery plays a vital role in applications such as sustainable development, agriculture, forestry, urban planning, and climate change monitoring. Despite its importance, optimizing deep learning models for this task remains challenging. This study evaluates several advanced deep learning architectures (UNet, LinkNet, DeepLabV3+, and a modified version, AE-DeepLabV3+) in combination with different backbone networks, including ResNet101, ResNet152, Xception, MobileNetV2, and EfficientNetV2. The objective is to identify the most effective model for classifying satellite images into eight land cover categories: built-up areas, roads, water bodies, agricultural land, shrubland, forest, grassland, and others. A high-resolution dataset with corresponding segmentation masks was developed to support this analysis and serve as a resource for future research. Preprocessing steps included normalization and data augmentation techniques such as vertical and horizontal flipping and random brightness adjustments. Experimental results indicate that UNet with the Xception backbone, LinkNet with ResNet152, DeepLabV3+ with the Xception backbone, and AE-DeepLabV3+ with the Xception backbone achieved Dice coefficients of 85.7%, 86.7%, 90.4%, and 91.3%, respectively. Among these, AE-DeepLabV3+ with Xception demonstrated the highest segmentation accuracy. The findings are contextualized through comparison with recent studies, highlighting the model’s ability to generalize across diverse geographic regions. To enhance model transparency and interpretability, the explainable AI (XAI) techniques Seg-Grad-CAM++ and Seg-Score-CAM are employed to visualize class-specific feature attributions and better understand the model’s decision-making process.
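To make the preprocessing and the reported metric concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not state the normalization scheme, the brightness range, or how Dice scores were averaged across classes, so scaling to [0, 1], the `max_brightness_delta` value, and the per-class smoothing term `eps` are all assumptions; the function names `augment` and `dice_per_class` are hypothetical.

```python
import numpy as np

NUM_CLASSES = 8  # the eight land cover categories listed in the abstract


def augment(image, mask, rng, max_brightness_delta=0.2):
    """Normalize an image, then apply random flips and a brightness shift.

    image: uint8 array of shape (H, W, C); mask: integer class ids, shape (H, W).
    Flips are applied to image and mask together so they stay aligned;
    the brightness shift touches only the image.
    """
    # Assumed normalization: scale pixel values to [0, 1].
    image = image.astype(np.float32) / 255.0
    if rng.random() < 0.5:  # horizontal flip (reverse the column axis)
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:  # vertical flip (reverse the row axis)
        image, mask = image[::-1], mask[::-1]
    # Assumed brightness range: a uniform additive shift, clipped to [0, 1].
    delta = rng.uniform(-max_brightness_delta, max_brightness_delta)
    image = np.clip(image + delta, 0.0, 1.0)
    return image, mask


def dice_per_class(pred, target, num_classes=NUM_CLASSES, eps=1e-7):
    """Per-class Dice, 2|A ∩ B| / (|A| + |B|), from integer label maps.

    eps avoids division by zero; a class absent from both maps scores 1.0.
    """
    scores = np.empty(num_classes, dtype=np.float64)
    for c in range(num_classes):
        p = pred == c
        t = target == c
        intersection = np.logical_and(p, t).sum()
        scores[c] = (2.0 * intersection + eps) / (p.sum() + t.sum() + eps)
    return scores
```

Calling `dice_per_class(pred, target).mean()` gives one possible summary value (a macro average over classes); whether the reported percentages were computed this way is not specified in the abstract.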
About the journal:
The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.