Combining readily available population and land cover maps to generate non-residential built-up labels to train Sentinel-2 image segmentation models
Authors: Diogo Duarte, Cidália C. Fonte
Journal: International journal of applied earth observation and geoinformation : ITC journal, volume 135, Article 104272 (impact factor 7.6; JCR Q1, Remote Sensing)
Published: 2024-11-17
DOI: 10.1016/j.jag.2024.104272
URL: https://www.sciencedirect.com/science/article/pii/S1569843224006289
Citation count: 0
Abstract
The localization of non-residential buildings over wide geographical areas is used as input in several contexts, such as disaster management, regional and national planning, and policy making and evaluation. While the built-up environment has been continuously and globally mapped, thanks to efforts to produce synoptic land cover information, little attention has been given to the land use component of such built-up areas. This is due to, for example, difficulties in distinguishing built-up land use in non-commercial satellite imagery (e.g., Sentinel-2, with a spatial resolution of up to 10 m), difficulties in collecting training data for supervised classification approaches, and the fact that variations in features of the built-up environment do not always translate to a specific land use. This is even more critical when considering nadir-viewing satellite or aerial imagery. However, map producers have been addressing this issue. For example, the Copernicus program (European Commission), through its pan-European CORINE Land Cover (CLC) and the Urban Atlas, which is restricted to several European metropolitan areas, has been making land use information on the built-up cover available at 6-year intervals. The Global Human Settlement Layer (Copernicus program) has been providing built-up land use information by distinguishing residential from non-residential built-up since 2023 (GHSL_NRES). Currently, these are also provided at a time interval of 5 years. National map producers often provide this information, but usually with an interval of several years between editions. In this paper, we combine readily available population counts and land cover maps to generate non-residential training labels that can be used to train a Sentinel-2 image segmentation model capable of distinguishing non-residential built-up from the remaining built-up.
Leveraging two publicly available datasets, population counts (WorldPop) and built-up land cover (ESA WorldCover), allowed us to produce training data from which an image segmentation model was able to learn relevant features to distinguish non-residential areas from other built-up areas in Sentinel-2 images. The results within a study area of 4 Sentinel-2 tiles showed that the approach improves the detection of non-residential built-up areas when compared with CLC and GHSL_NRES (F1-scores of 32 %, 25 % and 29 % for the proposed approach, CLC and GHSL_NRES, respectively), which are the products providing pan-European information on built-up land use. These results indicate that the combination of publicly available geospatial datasets may be used to produce higher quality geospatial information.
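The abstract does not spell out how the two datasets are combined, so the following is only a minimal sketch of one plausible rule, not the authors' method: a pixel is labelled non-residential when the land cover map marks it as built-up and the population raster reports no residents. The function names, the `max_pop` threshold, the choice to treat missing population values as zero, and the assumption that both rasters are already resampled to one common grid are all hypothetical. The built-up class code 50 follows the ESA WorldCover legend. A small F1-score helper is included because that is the metric the abstract reports.

```python
import numpy as np

BUILT_UP = 50  # ESA WorldCover class code for built-up


def non_residential_labels(land_cover, population, max_pop=0.0):
    """Hypothetical rule: built-up pixels with population <= max_pop.

    Both inputs are assumed to be co-registered on the same grid.
    Missing population values (NaN) are treated as zero, which is an
    assumption of this sketch.
    """
    land_cover = np.asarray(land_cover)
    population = np.asarray(population, dtype=float)
    if land_cover.shape != population.shape:
        raise ValueError("rasters must share the same grid")
    built_up = land_cover == BUILT_UP
    unpopulated = np.nan_to_num(population, nan=0.0) <= max_pop
    return (built_up & unpopulated).astype(np.uint8)


def f1_score(pred, ref):
    """Pixel-wise F1-score of a binary prediction against a reference."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    tp = np.sum(pred & ref)
    fp = np.sum(pred & ~ref)
    fn = np.sum(~pred & ref)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


# Toy 2 x 3 rasters (made-up values, for illustration only).
land_cover = np.array([[50, 50, 10],
                       [50, 40, 50]])
population = np.array([[0.0, 12.5, 0.0],
                       [np.nan, 0.0, 3.0]])
reference = np.array([[1, 0, 0],
                      [0, 0, 1]])

labels = non_residential_labels(land_cover, population)
score = f1_score(labels, reference)
print(labels)
print(score)
```

In practice, WorldPop and WorldCover are distributed at different spatial resolutions, so a resampling step would be needed before a rule like this could be applied; that step is omitted here.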
About the journal:
The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.