A hybrid machine learning approach to identify potential green cover area for bio–physical suitability mapping in the western semi–arid Rarh region of West Bengal, Purulia

IF 3.0 · Zone 4 (Environmental Science and Ecology) · JCR Q3 (Environmental Sciences)
Bikash Manna, Shweta Rani
Journal: Environmental Monitoring and Assessment, volume 198, issue 6
Publication date: 2026-05-08
DOI: 10.1007/s10661-026-15404-z
URL: https://link.springer.com/article/10.1007/s10661-026-15404-z

Abstract

Forest cover restoration is urgently needed in a semi–arid district of West Bengal, where land degradation endangers environmental stability and community welfare. The present study introduces and validates a data–driven machine learning framework for isolating optimal afforestation sites, with the aim of enhancing climate adaptability and creating sustainable, forest–centric livelihood opportunities. The methodology is structured as a sequential, hybrid workflow. First, an unsupervised K–Means clustering algorithm was applied to a suite of eleven environmental variables derived from SRTM, Landsat, and national geospatial databases to perform an exploratory delineation of potential zones. Next, training data were generated by manually digitizing samples and validating them visually against high–resolution imagery in Google Earth Pro. This dataset then served as the basis for training two supervised algorithms: Random Forest (RF) and XGBoost. A comparative evaluation confirmed the superior predictive power of the Random Forest model, which achieved an overall accuracy of 89.1% and an Area Under the ROC Curve (AUC) of 0.9508. An interpretability analysis using SHAP further revealed that slope, soil moisture, and elevation were the most critical determinants of suitability. The primary outcome is a spatially explicit suitability map that identifies 20.9% of the district's area as potentially suitable for afforestation. The map serves as a decision–support tool, enabling policymakers and community stakeholders to implement strategic and effective afforestation programs in the study area.
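
The two headline metrics reported in the abstract, overall accuracy and AUC, can be illustrated with a minimal sketch. It assumes a binary suitable/unsuitable labelling. The function names and the eight sample points are hypothetical and are not taken from the study, and the pairwise AUC calculation is a standard textbook formulation, not the authors' implementation.

```python
from typing import Sequence


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches the reference class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random suitable site (label 1) gets a
    higher score than a random unsuitable site (label 0); ties count one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical validation points: 1 = suitable, 0 = unsuitable.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical model probabilities for the "suitable" class.
probs = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
# Hard class predictions at an assumed 0.5 threshold.
preds = [1 if p >= 0.5 else 0 for p in probs]

print(f"overall accuracy = {overall_accuracy(labels, preds):.3f}")
print(f"AUC = {roc_auc(labels, probs):.3f}")
```

Overall accuracy depends on the chosen probability threshold (0.5 here, as an assumption), whereas AUC depends only on how the scores rank suitable sites against unsuitable ones. That is why the two figures are usually reported together when comparing classifiers.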

Source journal
Environmental Monitoring and Assessment (category: Environmental Sciences)
CiteScore: 4.70
Self-citation rate: 6.70%
Articles published: 1000
Review time: 7.3 months
Journal description: Environmental Monitoring and Assessment emphasizes technical developments and data arising from environmental monitoring and assessment, the use of scientific principles in the design of monitoring systems at the local, regional and global scales, and the use of monitoring data in assessing the consequences of natural resource management actions and pollution risks to man and the environment.