Rahat Tufail, Patrizia Tassinari, Daniele Torreggiani
Title: Assessing feature extraction, selection, and classification combinations for crop mapping using Sentinel-2 time series: A case study in northern Italy
Journal: Remote Sensing Applications: Society and Environment, vol. 38, Article 101525
Published: 2025-03-17
DOI: 10.1016/j.rsase.2025.101525
URL: https://www.sciencedirect.com/science/article/pii/S2352938525000783
Impact factor: 3.8 (JCR Q2, Environmental Sciences)
Citations: 0
Abstract
Rural areas need constant monitoring to ensure sustainable farming and to respond to environmental and climatic impacts. Over the last few decades, remote sensing data have been used extensively in agricultural monitoring, enabling cost-effective and efficient crop management. Selecting suitable data combinations for crop mapping while reducing dimensionality and redundancy to speed up processing remains a challenge. This study addresses that challenge by testing and assessing the efficiency of various combinations of feature extraction, feature selection, and classification methods. We used Sentinel-2 time series data, focusing on spectral features and derived vegetation indices, particularly the red-edge indices, which are critical for crop discrimination. To select the optimal data for the classifiers, we tested two feature selection techniques, Random Forest and Principal Component Analysis, on both the spectral bands and the vegetation indices. We then evaluated the resulting datasets with three machine learning algorithms, Extreme Gradient Boosting (XGB), Random Forest (RF), and Support Vector Machine (SVM), and one deep learning approach, Pixel-Set Encoders and Temporal Self-Attention (PSETAE). The results suggest that the most effective feature set is the Sentinel-2 spectral bands selected by the Random Forest feature selection method. This feature set achieves its highest overall accuracy with the XGB classifier, which outperforms the RF, SVM, and PSETAE classifiers. Further quantitative analysis of overall classification accuracies showed that Random Forest is the second-best performing classifier and that PSETAE produced the lowest accuracies for all data models.
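As a minimal illustration of the feature extraction step, the sketch below computes one red-edge index for a single pixel across several acquisition dates. The abstract does not name the indices used, so the choice of NDRE, defined here as (NIR - red edge) / (NIR + red edge), and the pairing of Sentinel-2 band B8 with B5 are assumptions, and the reflectance values are invented for the example.

```python
from typing import List, Sequence


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def ndre_series(nir: Sequence[float], red_edge: Sequence[float]) -> List[float]:
    """Compute NDRE for each acquisition date of one pixel."""
    if len(nir) != len(red_edge):
        raise ValueError("band series must have the same length")
    return [normalized_difference(n, r) for n, r in zip(nir, red_edge)]


# Hypothetical surface reflectance values for one pixel over four dates
# (assumed bands: B8 as NIR, B5 as red edge; not taken from the paper).
b8_nir = [0.30, 0.42, 0.50, 0.35]
b5_red_edge = [0.20, 0.18, 0.15, 0.22]

ndre = ndre_series(b8_nir, b5_red_edge)
```

Each value in `ndre` would become one temporal feature for that pixel; stacking such series for several indices and bands gives the kind of high-dimensional input that the feature selection step is meant to reduce.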
About the journal:
The journal "Remote Sensing Applications: Society and Environment" (RSASE) focuses on remote sensing studies that address specific topics with an emphasis on environmental and societal issues: regional or local studies with global significance. Submissions are encouraged to take an interdisciplinary approach. Subjects include, but are not limited to:
- Global and climate change studies addressing the impact of increasing concentrations of greenhouse gases, CO2 emissions, carbon balance and carbon mitigation, and energy systems on social and environmental systems
- Ecological and environmental issues including biodiversity, ecosystem dynamics, land degradation, atmospheric and water pollution, urban footprint, ecosystem management and natural hazards (e.g. earthquakes, typhoons, floods, landslides)
- Natural resource studies including land use in general, biomass estimation, forests, agricultural land, plantations, soils, coral reefs, wetlands and water resources
- Agriculture, food production systems and food security outcomes
- Socio-economic issues including urban systems, urban growth, public health, epidemics, land-use transition and land-use conflicts
- Oceanography and coastal zone studies, including sea level rise projections, coastline changes and the ocean-land interface
- Regional challenges for remote sensing application techniques, monitoring and analysis, such as cloud screening and atmospheric correction for tropical regions
- Interdisciplinary studies combining remote sensing, household survey data, field measurements and models to address environmental, societal and sustainability issues
- Quantitative and qualitative analysis that documents the impact of using remote sensing studies in social, political, environmental or economic systems