Land Cover Classification in the Antioquia Region of the Tropical Andes Using NICFI Satellite Data Program Imagery and Semantic Segmentation Techniques
Luisa F. Gomez-Ossa, G. Sanchez-Torres, John W. Branch-Bedoya
{"title":"Land Cover Classification in the Antioquia Region of the Tropical Andes Using NICFI Satellite Data Program Imagery and Semantic Segmentation Techniques","authors":"Luisa F. Gomez-Ossa, G. Sanchez-Torres, John W. Branch-Bedoya","doi":"10.3390/data8120185","DOIUrl":null,"url":null,"abstract":"Land cover classification, generated from satellite imagery through semantic segmentation, has become fundamental for monitoring land use and land cover change (LULCC). The tropical Andes territory provides opportunities due to its significance in the provision of ecosystem services. However, the lack of reliable data for this region, coupled with challenges arising from its mountainous topography and diverse ecosystems, hinders the description of its coverage. Therefore, this research proposes the Tropical Andes Land Cover Dataset (TALANDCOVER). It is constructed from three sample strategies: aleatory, minimum 50%, and 70% of representation per class, which address imbalanced geographic data. Additionally, the U-Net deep learning model is applied for enhanced and tailored classification of land covers. Using high-resolution data from the NICFI program, our analysis focuses on the Department of Antioquia in Colombia. The TALANDCOVER dataset, presented in TIF format, comprises multiband R-G-B-NIR images paired with six labels (dense forest, grasslands, heterogeneous agricultural areas, bodies of water, built-up areas, and bare-degraded lands) with an estimated 0.76 F1 score compared to ground truth data by expert knowledge and surpassing the precision of existing global cover maps for the study area. To the best of our knowledge, this work is a pioneer in its release of open-source data for segmenting coverages with pixel-wise labeled NICFI imagery at a 4.77 m resolution. The experiments carried out with the application of the sample strategies and models show F1 score values of 0.70, 0.72, and 0.74 for aleatory, balanced 50%, and balanced 70%, respectively, over the expert segmented sample (ground truth), which suggests that the personalized application of our deep learning model, together with the TALANDCOVER dataset offers different possibilities that facilitate the training of deep architectures for the classification of large-scale covers in complex areas, such as the tropical Andes. This advance has significant potential for decision making, emphasizing sustainable land use and the conservation of natural resources.","PeriodicalId":36824,"journal":{"name":"Data","volume":"4 11","pages":""},"PeriodicalIF":2.2000,"publicationDate":"2023-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data","FirstCategoryId":"90","ListUrlMain":"https://doi.org/10.3390/data8120185","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Land cover classification, generated from satellite imagery through semantic segmentation, has become fundamental for monitoring land use and land cover change (LULCC). The tropical Andes territory provides opportunities due to its significance in the provision of ecosystem services. However, the lack of reliable data for this region, coupled with challenges arising from its mountainous topography and diverse ecosystems, hinders the description of its coverage. Therefore, this research proposes the Tropical Andes Land Cover Dataset (TALANDCOVER). It is constructed from three sample strategies: aleatory, minimum 50%, and 70% of representation per class, which address imbalanced geographic data. Additionally, the U-Net deep learning model is applied for enhanced and tailored classification of land covers. Using high-resolution data from the NICFI program, our analysis focuses on the Department of Antioquia in Colombia. The TALANDCOVER dataset, presented in TIF format, comprises multiband R-G-B-NIR images paired with six labels (dense forest, grasslands, heterogeneous agricultural areas, bodies of water, built-up areas, and bare-degraded lands) with an estimated 0.76 F1 score compared to ground truth data by expert knowledge and surpassing the precision of existing global cover maps for the study area. To the best of our knowledge, this work is a pioneer in its release of open-source data for segmenting coverages with pixel-wise labeled NICFI imagery at a 4.77 m resolution. The experiments carried out with the application of the sample strategies and models show F1 score values of 0.70, 0.72, and 0.74 for aleatory, balanced 50%, and balanced 70%, respectively, over the expert segmented sample (ground truth), which suggests that the personalized application of our deep learning model, together with the TALANDCOVER dataset offers different possibilities that facilitate the training of deep architectures for the classification of large-scale covers in complex areas, such as the tropical Andes. This advance has significant potential for decision making, emphasizing sustainable land use and the conservation of natural resources.