Covariate selection approaches in spatial prediction of soil quality indices using machine learning models at the watershed scale, west of Iran

Impact factor 6.1; Zone 1, Agricultural and Forestry Sciences; JCR Q1, Soil Science
Marziyeh Zandi Baghche-Maryam, Mohsen Sheklabadi, Shamsollah Ayoubi
Soil & Tillage Research, Volume 252, Article 106571. Journal Article. Published 2025-04-11.
DOI: 10.1016/j.still.2025.106571
https://www.sciencedirect.com/science/article/pii/S0167198725001254

Abstract

Assessing soil quality indices (SQIs) is fundamental to managing agricultural and natural resources and to sustainable management practices. This study addresses the digital mapping of SQIs, with the objective of comparing the efficacy of four variable selection methods in Hamadan Province (west of Iran). The selection methods were principal component analysis (PCA), Boruta, recursive feature elimination (RFE), and random forest (RF), and each was evaluated by its effect on the predictive power of three machine learning (ML) models: artificial neural network (ANN), random forest (RF), and the Cubist algorithm. Environmental variables were extracted from a digital elevation model (DEM) and a Sentinel-2 image and employed in three scenarios: (i) topographic attributes, (ii) remote sensing data, and (iii) the integration of scenarios (i) and (ii). A systematic and random grid sampling method was used to collect 150 soil samples. Surface soil samples were collected from the 0–25 cm depth in agricultural and rangeland areas. The dominant soil groups were Xerorthents, Calcixerepts, and Haploxerepts. The samples were analyzed for physical and chemical properties, and the minimum data set (MDS) was determined by applying PCA. The indicators were scored using both linear and non-linear functions, and the SQIs were calculated using the Additive Soil Quality Index (SQIa), the Weighted Soil Quality Index (SQIw), and the Nemoro Soil Quality Index (SQIn) methods. The best-performing SQIs were the SQIn values derived from the MDS and the total data set (TDS) using linear scoring. The results showed that scenario (iii) consistently yielded the most accurate predictions. The combination of the Boruta method and the Cubist algorithm gave the best results in this context, with R² = 0.84 and RMSE = 0.023. The accuracy assessment showed that this combination had the highest accuracy for predicting SQI in all three scenarios. The uncertainty assessment revealed that the northwestern part of the study area exhibited a higher degree of uncertainty, which can be attributed to the high diversity in topographic and soil attributes. The findings offer a framework for developing spatial models to generate soil quality maps at a large scale, thereby supporting informed decision-making in future land-use planning.
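The scoring and index steps named in the abstract can be illustrated with a minimal sketch. The abstract does not state the formulas, so the forms below are assumptions taken from common usage in the soil quality literature: linear scoring as x / max ("more is better") or min / x ("less is better"), the additive index as the mean score, the weighted index as a weight-normalized sum, and the Nemoro index as sqrt((P_ave² + P_min²) / 2) × (n − 1) / n. The function names, indicator scores, and weights are hypothetical and are not taken from the study.

```python
import math


def linear_score(values, more_is_better=True):
    """Linearly score one indicator across samples (assumed form).

    'More is better': x / max(x). 'Less is better': min(x) / x.
    """
    if more_is_better:
        top = max(values)
        return [v / top for v in values]
    low = min(values)
    return [low / v for v in values]


def sqi_additive(scores):
    """Additive index: mean of the indicator scores for one sample."""
    return sum(scores) / len(scores)


def sqi_weighted(scores, weights):
    """Weighted index: weighted sum of scores, with weights normalized to 1."""
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total


def sqi_nemoro(scores):
    """Nemoro index: combines the mean and the minimum indicator score."""
    n = len(scores)
    p_ave = sum(scores) / n
    p_min = min(scores)
    return math.sqrt((p_ave ** 2 + p_min ** 2) / 2) * (n - 1) / n


# Hypothetical scores of four indicators for a single soil sample,
# with hypothetical weights (e.g. such as might come from PCA communalities).
sample_scores = [0.8, 0.6, 0.4, 1.0]
sample_weights = [0.4, 0.3, 0.2, 0.1]

print(round(sqi_additive(sample_scores), 3))
print(round(sqi_weighted(sample_scores, sample_weights), 3))
print(round(sqi_nemoro(sample_scores), 3))
```

Because the Nemoro form includes the minimum score, a single poorly scoring indicator lowers that index more than it lowers the additive or weighted index.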
Source journal: Soil & Tillage Research (Agricultural and Forestry Sciences, Soil Science)
CiteScore: 13.00
Self-citation rate: 6.20%
Articles published: 266
Review time: 5 months
Journal description: Soil & Tillage Research examines the physical, chemical and biological changes in the soil caused by tillage and field traffic. Manuscripts will be considered on aspects of soil science, physics, technology, mechanization and applied engineering for a sustainable balance among productivity, environmental quality and profitability. The following are examples of suitable topics within the scope of the journal: the agricultural and biosystems engineering associated with tillage (including no-tillage, reduced-tillage and direct drilling), irrigation and drainage, crops and crop rotations, fertilization, rehabilitation of mine spoils and processes used to modify soils; soil change effects on establishment and yield of crops, growth of plants and roots, structure and erosion of soil, cycling of carbon and nutrients, greenhouse gas emissions, leaching, runoff and other processes that affect environmental quality; and characterization or modeling of tillage and field traffic responses, soil, climate, or topographic effects, soil deformation processes, tillage tools, traction devices, energy requirements, economics, surface and subsurface water quality effects, tillage effects on weed, pest and disease control, and their interactions.