Land Cover Classification in the Antioquia Region of the Tropical Andes Using NICFI Satellite Data Program Imagery and Semantic Segmentation Techniques

Impact factor: 2.0 · JCR quartile: Q3 · Category: Computer Science, Information Systems

Journal: Data · Publication date: 2023-12-04 · DOI: 10.3390/data8120185

Luisa F. Gomez-Ossa, G. Sanchez-Torres, John W. Branch-Bedoya


Citations: 0

Abstract

Land cover classification, generated from satellite imagery through semantic segmentation, has become fundamental for monitoring land use and land cover change (LULCC). The tropical Andes are of particular interest because of their significance in the provision of ecosystem services. However, the lack of reliable data for this region, coupled with the challenges posed by its mountainous topography and diverse ecosystems, hinders the description of its land cover. This research therefore proposes the Tropical Andes Land Cover Dataset (TALANDCOVER). The dataset is constructed using three sampling strategies that address imbalanced geographic data: aleatory (random) sampling, and sampling with a minimum of 50% and of 70% representation per class. Additionally, the U-Net deep learning model is applied for enhanced and tailored classification of land covers. Using high-resolution data from the NICFI program, our analysis focuses on the Department of Antioquia in Colombia. The TALANDCOVER dataset, presented in TIF format, comprises multiband R-G-B-NIR images paired with six labels (dense forest, grasslands, heterogeneous agricultural areas, bodies of water, built-up areas, and bare-degraded lands). Its labels achieve an estimated F1 score of 0.76 against ground truth data produced by expert knowledge, surpassing the precision of existing global cover maps for the study area. To the best of our knowledge, this work is the first to release open-source data for segmenting land covers with pixel-wise labeled NICFI imagery at a 4.77 m resolution. Experiments applying the sampling strategies and models yield F1 scores of 0.70, 0.72, and 0.74 for the aleatory, balanced 50%, and balanced 70% strategies, respectively, over the expert-segmented sample (ground truth). These results suggest that the tailored application of our deep learning model, together with the TALANDCOVER dataset, facilitates the training of deep architectures for large-scale land cover classification in complex areas such as the tropical Andes. This advance has significant potential for decision making, emphasizing sustainable land use and the conservation of natural resources.
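The balanced sampling strategies and the F1 evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how "minimum representation per class" is computed or which F1 averaging is used, so everything below is an assumption for illustration: labels are encoded as integers 0 to 5, a patch is kept when some class covers at least the threshold fraction of its pixels, and F1 is macro-averaged. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np

NUM_CLASSES = 6  # six TALANDCOVER labels; the 0..5 integer encoding is an assumption


def class_fractions(mask: np.ndarray, num_classes: int = NUM_CLASSES) -> np.ndarray:
    """Return the fraction of pixels belonging to each class in a label mask."""
    counts = np.bincount(mask.ravel(), minlength=num_classes)
    return counts / mask.size


def keep_patch(mask: np.ndarray, min_fraction: float) -> bool:
    """Assumed rule: keep a patch when some class covers at least min_fraction of it."""
    return bool(class_fractions(mask).max() >= min_fraction)


def macro_f1(y_true: np.ndarray, y_pred: np.ndarray,
             num_classes: int = NUM_CLASSES) -> float:
    """Unweighted mean of per-class F1, skipping classes absent from both arrays."""
    scores = []
    for c in range(num_classes):
        tp = np.sum((y_true == c) & (y_pred == c))
        fp = np.sum((y_true != c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        if tp + fp + fn == 0:
            continue  # class c appears in neither truth nor prediction
        scores.append(2 * tp / (2 * tp + fp + fn))
    return float(np.mean(scores))


# Toy patch: 12 of 16 pixels are class 0, so its dominant fraction is 0.75.
patch = np.zeros((4, 4), dtype=np.int64)
patch[0, :] = 1
print(keep_patch(patch, 0.50), keep_patch(patch, 0.70))
```

Under this assumed rule, raising the threshold from 0.50 to 0.70 keeps fewer but more homogeneous patches. The paper's actual sampling criteria may differ.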
