A hybrid machine learning approach to identify potential green cover area for bio–physical suitability mapping in the western semi–arid Rarh region of West Bengal, Purulia
Authors: Bikash Manna, Shweta Rani
Journal: Environmental Monitoring and Assessment, volume 198, issue 6 (published 2026-05-08)
DOI: 10.1007/s10661-026-15404-z
URL: https://link.springer.com/article/10.1007/s10661-026-15404-z
Abstract
Forest cover restoration is urgently needed in a semi–arid district of West Bengal, where land degradation endangers environmental stability and community welfare. The present study introduces and validates a data–driven machine learning framework to identify optimal sites for afforestation, aiming to enhance climate adaptability and create sustainable, forest–centric livelihood opportunities. The methodology is structured as a sequential, hybrid workflow. Initially, an unsupervised K–Means clustering algorithm was applied to a suite of eleven environmental variables derived from SRTM, Landsat, and national geospatial databases to perform an exploratory delineation of potential zones. This was followed by meticulous training data generation, in which samples were manually digitized through high–resolution visual validation on Google Earth Pro. This dataset then served as the basis for training two supervised algorithms: Random Forest (RF) and XGBoost. A comparative evaluation confirmed the superior predictive power of the Random Forest model, which achieved an overall accuracy of 89.1% and an Area Under the ROC Curve (AUC) of 0.9508. An interpretability analysis using SHAP further revealed that slope, soil moisture, and elevation were the most critical determinants of suitability. The primary outcome is a spatially explicit suitability map that identifies 20.9% of the district's area as potentially suitable for afforestation. The map serves as a decision–support tool, enabling policymakers and community stakeholders to implement strategic and effective afforestation programs in the study area.
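The abstract reports two evaluation metrics, overall accuracy and AUC. As a minimal sketch of how such figures are typically computed for a binary suitable/unsuitable classification, the following uses plain Python. The function names, the 0.5 decision threshold, and the sample labels and scores are hypothetical and chosen only for illustration; they are not taken from the study, whose evaluation code is not given here.

```python
from typing import Sequence


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches the reference class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one (ties count as one half).
    This is the rank-based (Mann-Whitney) formulation of the ROC area."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical validation sample: 1 = suitable, 0 = unsuitable.
# Scores stand in for a classifier's predicted probability of "suitable".
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("overall accuracy:", overall_accuracy(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

Overall accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two are commonly reported together when comparing classifiers.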
About the journal:
Environmental Monitoring and Assessment emphasizes technical developments and data arising from environmental monitoring and assessment, the use of scientific principles in the design of monitoring systems at the local, regional and global scales, and the use of monitoring data in assessing the consequences of natural resource management actions and pollution risks to man and the environment.