E.U. Eyo , S.J. Abbey , T.T. Lawrence , F.K. Tetteh
Title: Improved prediction of clay soil expansion using machine learning algorithms and meta-heuristic dichotomous ensemble classifiers
Journal: Geoscience Frontiers, 13(1), Article 101296
Publication date: 2022-01-01
DOI: 10.1016/j.gsf.2021.101296
URL: https://www.sciencedirect.com/science/article/pii/S1674987121001602
Impact factor: 8.5 (JCR Q1, Geosciences, Multidisciplinary)
Citations: 14
Abstract
Soil swelling is considered one of the most devastating geo-hazards in modern history, so reliably determining a soil’s ability to expand is vital for achieving safe and secure ground for infrastructure. Accordingly, this study provides a novel, intelligent approach that improves the estimation of swelling using kernelised machines (Bayesian linear regression (BLR), Bayes point machine (BPM), support vector machine (SVM) and deep support vector machine (D-SVM)), alongside multiple linear regressor (REG), logistic regressor (LR) and artificial neural network (ANN), and tree-based algorithms such as decision forest (RDF) and boosted trees (BDT). Also, and for the first time, meta-heuristic classifiers incorporating the techniques of voting (VE) and stacking (SE) were used. Different independent scenarios combining the explanatory features that influence swelling behaviour were investigated. Preliminary results indicated that BLR deviated most from the predictor variable (the actual swell-strain). REG and BLR performed slightly better than ANN, while the meta-heuristic learners (VE and SE) produced the best overall performance (VE exhibited the greatest R² value of 0.94 and an RMSE of 0.06%). Cation exchange capacity (CEC), plasticity index and moisture content were the features considered to have the highest importance. Kernelised binary classifiers (SVM, D-SVM and BPM) gave better accuracy (average accuracy and recall of 0.93 and 0.60, respectively) than ANN, LR and RDF. A sensitivity-driven diagnostic test indicated that the meta-heuristic models performed best when ML training used the k-fold validation technique. Finally, it is recommended that the concepts developed herein be deployed during the preliminary phases of geotechnical or geological site characterisation, using the best-performing meta-heuristic models via their background coding resource.
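The voting idea and the k-fold validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it swaps in two toy base learners (a one-feature least-squares line and a k-nearest-neighbour average), combines them by simple averaging as a stand-in for a voting ensemble, and scores the result with R² and RMSE over k folds. The function names, the single "plasticity index" feature and the synthetic data are all hypothetical and chosen only for illustration.

```python
import math
import random


def fit_linear(xs, ys):
    """Least-squares line y = a + b*x for a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def fit_knn(xs, ys, k=3):
    """Predict with the mean target of the k nearest training points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def fit_voting(xs, ys):
    """Toy voting ensemble: average the predictions of the base learners."""
    members = [fit_linear(xs, ys), fit_knn(xs, ys)]
    return lambda x: sum(m(x) for m in members) / len(members)


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_scores(fit, xs, ys, k=5, seed=0):
    """Train on k-1 folds, predict the held-out fold, pool all predictions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    y_true, y_pred = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            y_true.append(ys[i])
            y_pred.append(model(xs[i]))
    return r2(y_true, y_pred), rmse(y_true, y_pred)


def make_synthetic(n=60, seed=1):
    """Hypothetical data: swell strain rising linearly with plasticity index."""
    rng = random.Random(seed)
    xs = [rng.uniform(5, 60) for _ in range(n)]
    ys = [0.1 * x + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_synthetic()
    for name, fit in [("linear", fit_linear), ("knn", fit_knn), ("voting", fit_voting)]:
        score, err = k_fold_scores(fit, xs, ys)
        print(f"{name}: R2={score:.3f} RMSE={err:.3f}")
```

A stacking ensemble would differ in that a second-level learner is trained on the base learners' out-of-fold predictions instead of averaging them with fixed weights.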
Geoscience Frontiers (Earth and Planetary Sciences: General Earth and Planetary Sciences)
CiteScore: 17.80
Self-citation rate: 3.40%
Articles published: 147
Review time: 35 days
About the journal:
Geoscience Frontiers (GSF) is the Journal of China University of Geosciences (Beijing) and Peking University. It publishes peer-reviewed research articles and reviews in interdisciplinary fields of Earth and Planetary Sciences. GSF covers various research areas including petrology and geochemistry, lithospheric architecture and mantle dynamics, global tectonics, economic geology and fuel exploration, geophysics, stratigraphy and paleontology, environmental and engineering geology, astrogeology, and the nexus of resources-energy-emissions-climate under Sustainable Development Goals. The journal aims to bridge innovative, provocative, and challenging concepts and models in these fields, providing insights on correlations and evolution.