Probabilistic machine learning for predicting desiccation cracks in clayey soils
Babak Jamhiri, Yongfu Xu, Mahdi Shadabfar, Susanga Costa
Bulletin of Engineering Geology and the Environment, 82 (9), published 2023-08-17
DOI: 10.1007/s10064-023-03366-2
https://link.springer.com/article/10.1007/s10064-023-03366-2
Abstract
Through frequent heatwaves and drought-downpour cycles, climate change gives rise to severe desiccation cracks. This research develops a probabilistic machine learning (ML) framework to improve on deterministic models. A set of data-driven soil and environmental parameters, namely initial water content (IWC), crack water content (CWC), final water content (FWC), soil layer thickness (SLT), temperature (Temp), and relative humidity (RH), is used as inputs to predict the crack surface ratio (CSR). A comprehensive set of ML models is developed for the predictions: tree-based models (random forests [RF] and regression trees [RT]), gradient-boosted trees (GBT and XGBT), support-vector machines (SVM), and artificial neural network-particle swarm optimization (ANN-PSO). Monte Carlo simulation (MCS) is then employed to introduce uncertainty into these models by shuffling and randomizing the samples. Two sensitivity analyses, input exclusion and partial dependence-individual conditional expectation plots, are carried out to assess prediction reliability. The results rank the developed models by performance as SVM > GBT > XGBT > ANN-PSO > RF > RT. Under the MCS-based probabilistic modeling, however, GBT predicts with the lowest errors and uncertainties. Ranked by higher coefficient of determination and lower standard deviation, the models order as GBT > SVM > XGBT > RF > ANN-PSO > RT. The sensitivity analyses rank parameter importance as FWC > CWC > SLT > IWC > Temp > RH. These findings demonstrate the capability of probabilistic ML under uncertainty: measuring the variance of prediction errors improves performance precision.
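The Monte Carlo step described in the abstract can be illustrated with a minimal sketch. It assumes that "shuffling and randomizing samples" means repeated random train/test splits, which the abstract does not confirm. The one-variable least-squares model is a stand-in for the paper's models, the data are synthetic, the FWC-to-CSR relationship is arbitrary, and all function names (`fit_line`, `r2_score`, `monte_carlo_r2`) are hypothetical. The point is only to show how a mean and a standard deviation of the coefficient of determination arise from many randomized runs.

```python
import random
import statistics


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b * x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    my = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def monte_carlo_r2(xs, ys, n_runs=200, test_fraction=0.3, seed=0):
    """Refit on many shuffled splits; return mean and std of test R^2."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_test = max(2, int(len(idx) * test_fraction))
    scores = []
    for _ in range(n_runs):
        rng.shuffle(idx)  # a new random partition on every run
        test, train = idx[:n_test], idx[n_test:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in test]
        scores.append(r2_score([ys[i] for i in test], preds))
    return statistics.fmean(scores), statistics.stdev(scores)


# Synthetic, illustrative data only (NOT from the paper): a made-up
# linear link between final water content and crack surface ratio.
data_rng = random.Random(42)
fwc = [data_rng.uniform(5.0, 40.0) for _ in range(120)]
csr = [30.0 - 0.6 * w + data_rng.gauss(0.0, 2.0) for w in fwc]

mean_r2, std_r2 = monte_carlo_r2(fwc, csr)
print(f"R^2 over runs: mean={mean_r2:.3f}, std={std_r2:.3f}")
```

Running the same loop for each candidate model and comparing the resulting means and standard deviations would give a ranking of the kind reported above, with a higher mean and a lower spread both counting in a model's favor.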
About the journal:
Engineering geology is defined in the statutes of the IAEG as the science devoted to the investigation, study and solution of engineering and environmental problems which may arise as the result of the interaction between geology and the works or activities of man, as well as of the prediction of and development of measures for the prevention or remediation of geological hazards. Engineering geology embraces:
• the applications/implications of the geomorphology, structural geology, and hydrogeological conditions of geological formations;
• the characterisation of the mineralogical, physico-geomechanical, chemical and hydraulic properties of all earth materials involved in construction, resource recovery and environmental change;
• the assessment of the mechanical and hydrological behaviour of soil and rock masses;
• the prediction of changes to the above properties with time;
• the determination of the parameters to be considered in the stability analysis of engineering works and earth masses.