Cross-model feature-importance analysis of soil properties for predicting optimum moisture content and maximum dry unit weight of fine-grained soils
Authors: Harish Paneru, Netra Prakash Bhandary
Journal: Soils and Foundations, Vol. 66, No. 2, Article 101728
DOI: 10.1016/j.sandf.2025.101728
Published: 2026-04-01 (Epub 2026-01-21)
URL: https://www.sciencedirect.com/science/article/pii/S0038080625001623
This study evaluates the influence of routine soil index properties on the prediction of optimum moisture content (w_opt) and maximum dry unit weight (γ_dmax), which are the primary outcomes of the Proctor compaction test, using machine learning (ML) methods. A curated database of fine-grained soils (n = 465, drawn from 15 sources) included gravel content (GC), sand content (SC), fines content (FC), liquid limit (LL), plastic limit (PL), plasticity index (PI), specific gravity (Gs), w_opt, and γ_dmax. After correlation-based feature filtering, three models were developed: Generalized Additive Model (GAM), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). The training used nested cross-validation with Bayesian optimization, corresponding to an overall 80–20 train-test split. The model performance was evaluated using R², RMSE, MAE, MAPE, r, and the overfitting ratio calculated for the test set. For w_opt, the best GAM model achieved R² = 0.84 and RMSE = 2.16%, outperforming RF and XGBoost. For γ_dmax, the best GAM and XGBoost models reached R² = 0.79 and RMSE = 0.76 kN/m³, respectively. SHapley Additive exPlanations (SHAP), model-based importance scores, and single ablation analyses consistently identified LL and PL as the most influential predictors, and FC provided secondary contributions, while GC and Gs added little once LL and PL had been included. Moreover, paired-feature ablation confirmed the joint influence of LL and PL on the prediction. Overall, all three models predicted compaction parameters with good accuracy; however, GAM models achieved comparable or better predictive metric values than the ensembles (RF and XGBoost) while offering interpretability through plots linking soil indices with the predicted outcomes. This balance of accuracy and interpretability supports GAM as the preferred model for predicting these compaction parameters.
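The evaluation metrics named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' code. The function names `regression_metrics` and `overfitting_ratio` and the sample values are hypothetical, and because the abstract does not define the overfitting ratio, the sketch assumes it is the test RMSE divided by the training RMSE.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE, MAPE (in percent), and Pearson r for paired values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in residuals)            # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)   # total sum of squares
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in residuals) / n,
        # MAPE assumes no observed value is zero, which holds for w_opt and γ_dmax.
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(residuals, y_true)) / n,
        "r": cov / math.sqrt(ss_tot * ss_pred),
    }


def overfitting_ratio(rmse_train, rmse_test):
    """Assumed definition (not stated in the abstract): test RMSE / train RMSE."""
    return rmse_test / rmse_train


# Hypothetical observed and predicted w_opt values (%), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
scores = regression_metrics(observed, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Under the assumed definition, a ratio near 1 would indicate similar errors on training and test data, while a value well above 1 would suggest overfitting.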
About the journal:
Soils and Foundations is one of the leading journals in the field of soil mechanics and geotechnical engineering. It is the official journal of the Japanese Geotechnical Society (JGS). The journal publishes original research papers, technical reports, and technical notes, as well as state-of-the-art reports by invitation of the Editor, in the fields of soil and rock mechanics, geotechnical engineering, and environmental geotechnics. First published in June 1960 (Volume 1, No. 1), Soils and Foundations will celebrate its 60th anniversary in 2020.
Soils and Foundations welcomes theoretical as well as practical work in the aforementioned fields. Case studies that describe original and interdisciplinary work applicable to geotechnical engineering are particularly encouraged. Discussions of published articles are also welcomed, as they provide an avenue for peers to give feedback and exchange opinions. To provide the latest expertise on specific topics, on average one of the six issues per year has been allocated to selected papers from international symposia held in Japan and overseas.