Robert Martin C. Santiago, R. Gustilo, G. Arada, E. Magsino, E. Sybingco
Performance Analysis of Machine Learning Algorithms in Generating Urban Land Cover Map of Quezon City, Philippines Using Sentinel-2 Satellite Imagery
2021 IEEE 13th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM)
Published: 2021-11-28
DOI: 10.1109/HNICEM54116.2021.9731856
Abstract
As urban expansion is expected to persist and may even accelerate in the coming years, understanding and effectively managing urbanization becomes increasingly important for long-term progress, specifically in making cities and human settlements inclusive, safe, resilient, and sustainable. One way to accomplish this is to obtain reliable, up-to-date information about the land cover characteristics of an area in the form of a map, which can be produced using remote sensing and machine learning. However, the use of these technologies for urban land cover mapping has been observed to occur at the level of individual geographic localities, and in the case of the Philippines, this is a domain that needs further exploration to quantitatively comprehend urban extent. In this study, a map of man-made structures (built-up areas) and natural structures (non-built-up areas) was generated over Quezon City and surrounding areas, where a rapid rise in population occurs along with urban development. In addition, since related previous studies used various machine learning algorithms for the classification, this study compared the performance of three algorithms, namely the random forest classifier, k-nearest neighbors, and the Gaussian mixture model, to identify which performed best in this particular application. The satellite imagery of the area of interest was collected from the Sentinel-2 mission satellites. All three algorithms attained high accuracies across all measurements with small variations, but they differed greatly in the time taken to perform the classification. The highest overall accuracy of 99.32% was obtained using the random forest classifier, although it took the longest time to finish the classification; next was 98.95% using the k-nearest neighbors algorithm, which also ranked second in speed of classification; and last was 97.17% using the Gaussian mixture model, although it was the fastest to complete the classification.
Further studies may explore other machine learning algorithms as well as deep learning techniques to harness their feature extraction capabilities for more complex applications. Aside from Sentinel-2, other satellite missions may also be used as sources of satellite imagery, offering different spectral, spatial, and temporal resolutions to fit a specific application.
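
The abstract compares classifiers on two axes: overall accuracy and the time taken to classify. The following is a minimal sketch of how those two quantities might be measured for one of the three algorithms (k-nearest neighbors), written in plain Python. It is not the authors' implementation: the function names, the choice of k = 3, and the two-band reflectance values are hypothetical and serve only to illustrate the idea of labelling pixels as built-up or non-built-up.

```python
import time
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, X, k=3):
    """Label each sample in X by majority vote among its k nearest training samples."""
    predictions = []
    for x in X:
        # Sort training indices by Euclidean distance to x and keep the k closest.
        nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
        votes = Counter(train_y[i] for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the reference label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def timed_evaluate(predict, X, y_true):
    """Return (overall accuracy, elapsed seconds) for one classifier call."""
    start = time.perf_counter()
    y_pred = predict(X)
    elapsed = time.perf_counter() - start
    return overall_accuracy(y_true, y_pred), elapsed


# Hypothetical two-band pixel values (red, near-infrared); not data from the study.
train_X = [(0.30, 0.32), (0.28, 0.30), (0.33, 0.35),   # built-up
           (0.05, 0.45), (0.06, 0.50), (0.04, 0.40)]   # non-built-up
train_y = ["built-up"] * 3 + ["non-built-up"] * 3
test_X = [(0.31, 0.33), (0.05, 0.47), (0.29, 0.31), (0.07, 0.42)]
test_y = ["built-up", "non-built-up", "built-up", "non-built-up"]

accuracy, seconds = timed_evaluate(
    lambda X: knn_predict(train_X, train_y, X, k=3), test_X, test_y
)
print(f"overall accuracy: {accuracy:.2%}, time: {seconds:.6f} s")
```

Passing a different `predict` callable to `timed_evaluate` would allow other classifiers to be scored the same way, which is one plausible way to set up the kind of accuracy-versus-speed comparison the abstract reports.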