Evaluating the performance of random forest, support vector machine, gradient tree boost, and CART for improved crop-type monitoring using greenest pixel composite in Google Earth Engine
Authors: Chirasmayee Savitha, Reshma Talari
Journal: Environmental Monitoring and Assessment, vol. 197 (4), published 2025-03-19
DOI: 10.1007/s10661-025-13880-3
URL: https://link.springer.com/article/10.1007/s10661-025-13880-3
Impact factor: 2.9; JCR: Q3 (Environmental Sciences)
Citation count: 0
Abstract
Machine learning algorithms, combined with high-resolution satellite datasets, improve agricultural monitoring and mapping. However, high-resolution optical imagery is often obscured by clouds and shadows, so it fails to capture complete crop phenology, which limits map accuracy. Identifying a suitable classification algorithm is also essential, because the performance of each machine learning algorithm depends on the input datasets, hyperparameter tuning, and the training and testing samples, among other factors. To overcome the limitation of clouds and shadows in optical data, this study employs a Sentinel-2 greenest pixel composite to generate a near-accurate crop-type map for an agricultural watershed in Tadepalligudem, India. To identify a suitable model, the study also evaluates and compares four machine learning algorithms: gradient tree boost, classification and regression tree (CART), support vector machine, and random forest (RF). Crop-type maps are generated for two cropping seasons, Kharif and Rabi, in Google Earth Engine (GEE), a robust cloud computing platform. To train and test these algorithms, ground truth data are collected and split 70:30 into training and testing sets, respectively. The results demonstrate that the greenest pixel composite method can identify and map crop types in small watersheds even during the Kharif season. Among the four algorithms, RF outperforms the others in both the Kharif and Rabi seasons, with an average overall accuracy of 93.21% and a kappa coefficient of 0.89. The study also showcases the potential of GEE for enhancing automated agricultural monitoring with satellite datasets while requiring minimal computational storage and processing.
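The abstract names two ideas that a small example can make concrete: the greenest pixel composite, and the overall accuracy and kappa coefficient used to compare classifiers. The following is a minimal sketch in plain Python, not the authors' GEE workflow. It assumes that NDVI is the greenness measure (the abstract does not state the index), and the function names, reflectance values, and confusion matrix are all hypothetical. In GEE, a composite of this kind is commonly built with `ImageCollection.qualityMosaic`; whether the study used that call is not stated.

```python
from typing import List, Sequence, Tuple


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index; 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def greenest_pixel(observations: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """For one pixel, return the (nir, red) observation with the highest NDVI.

    Assumption: cloudy or shadowed observations have lower NDVI than clear
    vegetated ones, so picking the maximum tends to select a clear date.
    """
    return max(observations, key=lambda obs: ndvi(obs[0], obs[1]))


def accuracy_and_kappa(matrix: List[List[int]]) -> Tuple[float, float]:
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Rows are reference classes and columns are predicted classes.
    Kappa is undefined when expected agreement is 1; that case is not handled here.
    """
    size = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / n
    # Expected chance agreement: sum of (row total * column total) / n^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (n * n)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical time series of (nir, red) reflectance for a single pixel.
series = [(0.30, 0.25), (0.45, 0.08), (0.20, 0.18)]
best = greenest_pixel(series)

# Hypothetical two-class confusion matrix from a held-out test set.
overall_accuracy, kappa = accuracy_and_kappa([[45, 5], [10, 40]])
```

Applied per pixel across a season of images, `greenest_pixel` yields one composite image; a classifier trained on 70% of the ground truth samples can then be scored on the remaining 30% with `accuracy_and_kappa`.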
About the journal:
Environmental Monitoring and Assessment emphasizes technical developments and data arising from environmental monitoring and assessment, the use of scientific principles in the design of monitoring systems at the local, regional and global scales, and the use of monitoring data in assessing the consequences of natural resource management actions and pollution risks to man and the environment.