Hyperspectral Image Classification Using Random Forest and Deep Learning Algorithms

2020 IEEE Latin American GRSS & ISPRS Remote Sensing Conference (LAGIRS). Pub Date: 2020-03-01. DOI: 10.1109/LAGIRS48042.2020.9165588

J. V. Rissati, P. C. Molina, C. S. Anjos


Citations: 8

Abstract

One purpose of hyperspectral remote sensing is to differentiate and identify the materials on the Earth’s surface from the spectral behavior of each object in different regions of the electromagnetic spectrum. This differentiation and identification can be accomplished with various image classification algorithms. However, no classifier is perfect, since every algorithm makes labeling errors. With the advent of orbital and aerial images of very high spatial and spectral resolution, materials in urban environments can be recognized with increasing accuracy. We therefore compare different methodologies to identify the algorithm that gives the best results in characterizing urban objects. The hyperspectral image used in this study covers the University of Houston, Texas, and its surroundings; it contains 48 spectral bands, with a spatial resolution of 1 meter and a spectral range of 380 nm to 1050 nm. To identify the 21 classes present in the study area, this paper analyzes two classification methods: Deep Learning and Random Forest. To improve classification accuracy, feature extraction was performed. For this preliminary step we used tools available in specific software, such as the Normalized Difference Vegetation Index (NDVI), Minimum Noise Fraction (MNF), Principal Component Analysis (PCA), and the Soil Adjusted Vegetation Index (SAVI). Image segmentation was performed with two methods, Multiresolution Segmentation and Spectral Difference. Multiresolution segmentation requires parameters for form and compactness; the best results were obtained with form = 0.7 and compactness = 0.5, at a scale of 10. From the resulting segments, samples of all classes in the study area were selected to train the algorithms. This step is of paramount importance, as sample collection directly affects the classification results.
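As an illustration of the two vegetation indices named above, the following sketch computes NDVI and SAVI per pixel from near-infrared and red reflectance, using the standard definitions of both indices. The soil factor L = 0.5, the function names, and the reflectance values are assumptions made for this example; the abstract does not state which of the 48 bands served as red and near-infrared, nor which settings the authors used.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    # Guard against division by zero for completely dark pixels.
    return 0.0 if denom == 0 else (nir - red) / denom


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index: (1 + L) * (NIR - Red) / (NIR + Red + L).

    soil_factor is the soil-brightness correction L; 0.5 is a commonly used
    default and is an assumption here, not a value taken from the paper.
    """
    denom = nir + red + soil_factor
    return 0.0 if denom == 0 else (1.0 + soil_factor) * (nir - red) / denom


# Hypothetical (NIR, Red) reflectance pairs in the range 0..1.
pixels = [(0.50, 0.08), (0.30, 0.25), (0.10, 0.09)]

# One (NDVI, SAVI) feature pair per pixel, which could be appended to the
# spectral bands before classification.
features = [(ndvi(nir, red), savi(nir, red)) for nir, red in pixels]
```

Both indices grow with the contrast between near-infrared and red reflectance, so vegetated pixels score higher than soil or pavement.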
After these steps, the information obtained from sample collection was entered into the data mining software WEKA 3.8 to train the classification algorithms. The results were analyzed by cross-validation, which yielded the confusion matrix from which the Overall Accuracy (OA) and Kappa index were calculated. The Random Forest classification achieved an overall accuracy of 84.72% and a Kappa index of 0.83, while the Deep Learning algorithm achieved an overall accuracy of 81.32% and a Kappa index of 0.80. In this case, Random Forest gave better results for hyperspectral image classification than Deep Learning. The accuracy difference between the methods is not considered significant, so future work should analyze complementary issues such as processing time.
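The two evaluation measures reported above can both be derived from a confusion matrix. The sketch below shows the usual formulas for Overall Accuracy and Cohen's Kappa; the 3-class matrix is made up for illustration (the paper's 21-class matrix is not given in the abstract), and the function names are hypothetical.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's Kappa: agreement beyond what chance alone would produce."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement expected from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical confusion matrix: rows are reference classes,
# columns are predicted classes.
example = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 7, 42],
]

oa = overall_accuracy(example)
kappa = cohen_kappa(example)
```

Kappa is lower than OA because it discounts the agreement that would be expected by chance, which is why the abstract reports both figures for each classifier.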

