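Cross-model feature-importance analysis of soil properties for predicting optimum moisture content and maximum dry unit weight of fine-grained soils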

IF 3.3 | Zone 2, Engineering & Technology | JCR Q2, ENGINEERING, GEOLOGICAL
Soils and Foundations Pub Date : 2026-04-01 Epub Date: 2026-01-21 DOI:10.1016/j.sandf.2025.101728
Harish Paneru, Netra Prakash Bhandary
Soils and Foundations, Volume 66, Issue 2, Article 101728
https://www.sciencedirect.com/science/article/pii/S0038080625001623

Abstract

This study evaluates the influence of routine soil index properties on the prediction of optimum moisture content (w_opt) and maximum dry unit weight (γ_dmax), which are the primary outcomes of the Proctor compaction test, using machine learning (ML) methods. A curated database of fine-grained soils (n = 465, drawn from 15 sources) included gravel content (GC), sand content (SC), fines content (FC), liquid limit (LL), plastic limit (PL), plasticity index (PI), specific gravity (G_s), w_opt, and γ_dmax. After correlation-based feature filtering, three models were developed: Generalized Additive Model (GAM), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). The training used nested cross-validation with Bayesian optimization, corresponding to an overall 80–20 train-test split. The model performance was evaluated using R², RMSE, MAE, MAPE, r, and the overfitting ratio calculated for the test set. For w_opt, the best GAM model achieved R² = 0.84 and RMSE = 2.16%, outperforming RF and XGBoost. For γ_dmax, the best GAM and XGBoost models reached R² = 0.79 and RMSE = 0.76 kN/m³, respectively. SHapley Additive exPlanations (SHAP), model-based importance scores, and single ablation analyses consistently identified LL and PL as the most influential predictors, and FC provided secondary contributions, while GC and G_s added little once LL and PL had been included. Moreover, paired-feature ablation confirmed the joint influence of LL and PL on the prediction. Overall, all three models predicted compaction parameters with good accuracy; however, GAM models achieved comparable or better predictive metric values than the ensembles (RF and XGBoost) while offering interpretability through plots linking soil indices with the predicted outcomes. This balance of accuracy and interpretability supports GAM as the preferred model for prediction modeling.
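The evaluation metrics named in the abstract (R², RMSE, MAE, MAPE, r, and an overfitting ratio) can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The function names `regression_metrics` and `overfitting_ratio` are hypothetical, the sample values are invented for illustration, and the abstract does not define the overfitting ratio, so the test-RMSE over training-RMSE form used here is only an assumption.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, MAPE (in percent), and Pearson r.

    Assumes y_true has no zeros (needed for MAPE) and that neither
    sequence is constant (needed for R2 and r).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in residuals)            # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)   # total sum of squares
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in residuals) / n,
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(residuals, y_true)) / n,
        "r": cov / math.sqrt(ss_tot * ss_pred),
    }


def overfitting_ratio(rmse_train, rmse_test):
    """ASSUMED definition: test RMSE divided by training RMSE.

    Values well above 1 would suggest the model fits the training data
    much better than unseen data. The source does not state its formula.
    """
    return rmse_test / rmse_train


if __name__ == "__main__":
    # Hypothetical optimum-moisture-content values in percent (not from the paper).
    measured = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 37.0]
    for name, value in regression_metrics(measured, predicted).items():
        print(f"{name}: {value:.3f}")
```

Note that RMSE and MAE carry the units of the target, which is why the abstract reports RMSE in percent for w_opt and in kN/m³ for γ_dmax, whereas R², r, and MAPE are unit-free.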
Source journal
Soils and Foundations (Engineering & Technology: Geosciences, Multidisciplinary)
CiteScore: 6.40
Self-citation rate: 8.10%
Articles published: 99
Review time: 5 months
Journal description: Soils and Foundations is one of the leading journals in the field of soil mechanics and geotechnical engineering. It is the official journal of the Japanese Geotechnical Society (JGS). The journal publishes a variety of original research papers, technical reports, and technical notes, as well as state-of-the-art reports upon invitation by the Editor, in the fields of soil and rock mechanics, geotechnical engineering, and environmental geotechnics. Volume 1, No. 1 was published in June 1960, and the journal marked its 60th anniversary in 2020. Soils and Foundations welcomes theoretical as well as practical work associated with these fields. Case studies that describe original and interdisciplinary work applicable to geotechnical engineering are particularly encouraged. Discussions of published articles are also welcomed, to provide an avenue through which the opinions of peers may be fed back or exchanged. To provide the latest expertise on specific topics, on average one issue out of six per year has been allocated to selected papers from International Symposia held in Japan and overseas.