A feature selection approach for terrestrial hyperspectral image analysis
Kyle Loggenberg, Nitesh K. Poona
South African Journal of Geomatics, published 2020-09-14
DOI: https://doi.org/10.4314/sajg.v9i2.20
Feature selection techniques are often employed to reduce data dimensionality, improve computational efficiency, and, most importantly, select a subset of the most important features for model building. The present study explored the utility of a Filter-Wrapper (FW) approach for feature selection using terrestrial hyperspectral remote sensing imagery. The efficacy of the FW approach was evaluated in conjunction with the Random Forest (RF) and Extreme Gradient Boosting (XGBoost) classifiers to discriminate between water-stressed and non-stressed Shiraz vines. The proposed FW approach yielded a test accuracy of 80.0% (KHAT = 0.6) for both RF and XGBoost, outperforming the more traditional Kruskal-Wallis (KW) filter by more than 20%. The FW approach was also less computationally expensive than the more commonly used Sequential Floating Forward Selection (SFFS) wrapper. Additionally, we examined the effect of hyperparameter optimisation on classification accuracy and computational expense. The results showed that RF marginally outperformed XGBoost when using all wavebands (p = 176) and optimised hyperparameter values: RF yielded a test accuracy of 83.3% (KHAT = 0.67), whereas XGBoost yielded a test accuracy of 81.7% (KHAT = 0.63). Our results further showed that optimising hyperparameter values increased test accuracy by 0.8% to 5.0% for both RF and XGBoost. Overall, the results highlight the effect of feature selection and optimisation on the performance of machine learning ensembles for modelling vineyard water stress.
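The KHAT values reported alongside each test accuracy are kappa statistics, which correct the observed agreement p_o (overall accuracy) for the agreement p_e expected by chance: KHAT = (p_o - p_e) / (1 - p_e). The sketch below is a minimal illustration of that calculation, not the authors' code. The function name and the confusion-matrix counts are hypothetical: a test set of 60 samples split evenly between the two classes is an assumption that the abstract does not state, chosen only because it reproduces an accuracy of 80.0% with a KHAT of 0.6.

```python
def cohen_kappa(confusion):
    """Kappa (KHAT) from a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: proportion of samples on the diagonal (overall accuracy).
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: sum over classes of (row proportion * column proportion).
    row_prop = [sum(row) / total for row in confusion]
    col_prop = [
        sum(confusion[i][j] for i in range(n_classes)) / total
        for j in range(n_classes)
    ]
    expected = sum(r * c for r, c in zip(row_prop, col_prop))

    return (observed - expected) / (1 - expected)


# Hypothetical counts (assumption): 30 stressed and 30 non-stressed test vines,
# with 6 errors in each class.
example = [
    [24, 6],   # reference: stressed
    [6, 24],   # reference: non-stressed
]

accuracy = (example[0][0] + example[1][1]) / 60
kappa = cohen_kappa(example)
print(round(accuracy, 2), round(kappa, 2))  # prints: 0.8 0.6
```

Under the same assumption of balanced reference classes, p_e equals 0.5 and KHAT reduces to 2 × accuracy - 1, which is consistent with the other reported pairs (83.3% with 0.67, and 81.7% with 0.63).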