Xingyang Liu, Degao Zou, Yuan Chen, Huafu Pei, Zhanchao Li, Linsong Sun, Laifu Song
Soil Dynamics and Earthquake Engineering, Volume 188, Article 109031. Published 2024-10-26. DOI: 10.1016/j.soildyn.2024.109031. URL: https://www.sciencedirect.com/science/article/pii/S0267726124005839
Analyzing the influence of particle size distribution on the maximum shear modulus of soil with an interpretable machine learning framework and laboratory test dataset
The maximum shear modulus (Gmax) is a key parameter for characterizing the dynamic properties of soils. In this research, a dataset was systematically collected and constructed through a literature review. It comprises 2782 instances of Gmax values and their influencing factors for various soil types, and it is intended for examining the effect of particle size distribution on Gmax. The eXtreme Gradient Boosting (XGBoost) algorithm was employed to develop a predictive model for Gmax, and the model's performance was then enhanced with the Bayesian Optimization (BO) algorithm. After comparison with other empirical models, the BO-XGBoost model was selected as the best model. Finally, the predictions of BO-XGBoost were interpreted using the SHapley Additive exPlanations (SHAP) framework to overcome the black-box problem of traditional machine learning methods. The results suggest that SHAP effectively extracts critical information from the data when data labels are appropriately configured, thereby increasing the reliability of the prediction outcomes. Globally, the feature importance ranking and the direction of correlations between input features and the output variable align with prior knowledge. Locally, however, the importance ranking of features for individual samples may deviate from the global trend, and the influence of the same input feature can vary across samples.
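The local-versus-global behaviour described in the abstract can be illustrated with a minimal sketch of exact Shapley attribution. Everything here is an assumption made for illustration: `toy_model` is a hypothetical three-feature function, not the paper's BO-XGBoost model, and the background rows and samples are invented numbers, not the paper's data. It also enumerates all feature subsets directly rather than using the SHAP library.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    """Hypothetical stand-in for a trained regressor (NOT the paper's model)."""
    a, b, c = x
    return 2.0 * a + b * c  # the b*c interaction makes attributions sample-dependent


def shapley_values(model, x, background):
    """Exact Shapley values of model(x) relative to a background sample."""
    n = len(x)

    def value(subset):
        # Mean prediction when features in `subset` are fixed to x
        # and the remaining features are taken from each background row.
        total = 0.0
        for row in background:
            mixed = [x[j] if j in subset else row[j] for j in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def ranking(phi):
    """Feature indices ordered by decreasing absolute contribution."""
    return sorted(range(len(phi)), key=lambda i: -abs(phi[i]))


# Invented illustrative data (assumption, not from the paper).
background = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (2.0, 0.0, 2.0), (1.0, 2.0, 0.0)]
sample_a = (2.0, 1.0, 0.5)
sample_b = (1.0, 3.0, 3.0)

base = sum(toy_model(r) for r in background) / len(background)
phi_a = shapley_values(toy_model, sample_a, background)
phi_b = shapley_values(toy_model, sample_b, background)

print("sample A contributions:", phi_a, "ranking:", ranking(phi_a))
print("sample B contributions:", phi_b, "ranking:", ranking(phi_b))
```

In this toy setting the contributions of each sample sum to its prediction minus the background mean, which is the additive property that SHAP relies on. The first feature dominates sample A but ranks last for sample B, and the third feature pushes the prediction down for A and up for B.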
About the journal:
The journal aims to encourage and enhance the role of mechanics and other disciplines as they relate to earthquake engineering by providing opportunities for the publication of the work of applied mathematicians, engineers and other applied scientists involved in solving problems closely related to the field of earthquake engineering and geotechnical earthquake engineering.
Emphasis is placed on new concepts and techniques, but case histories will also be published if they enhance the presentation and understanding of new technical concepts.