Combination models of random forest for predicting seismic liquefaction based on SPT, CPT, Vs databases considering sampling strategies

IF 4.6, Zone 2 (Engineering Technology), JCR Q1 (ENGINEERING, GEOLOGICAL)
Jilei Hu, Lianming Huang, Qi Shao
Journal: Soil Dynamics and Earthquake Engineering, vol. 198, Article 109642
Published: 2025-06-30
DOI: 10.1016/j.soildyn.2025.109642
URL: https://www.sciencedirect.com/science/article/pii/S026772612500435X
Citations: 0

Abstract

The sampling strategy strongly affects the accuracy of seismic liquefaction discrimination models, and different models may produce contradictory results. Based on data from three in situ tests (standard penetration test (SPT), cone penetration test (CPT), and shear wave velocity (Vs)), this paper uses the Random Forest (RF) method to analyze the effects of five probabilistic sampling methods (Simple Random Sampling (SRS), Unordered Systematic Sampling (USS), Ordered Systematic Sampling (OSS), Stratified Random Sampling (StrRS), and Cluster Sampling (CS)) and five integration methods (sequential integration, voting, simple averaging, weighted averaging, and Bayesian model averaging) on RF models of seismic liquefaction. It constructs three RF models, one for each in situ test database, and a Combined RF Model (CRF). The results show that the sampling method has a large impact on RF model performance. OSS performed best across the in situ test databases, with Acc = 0.9 and F1 = 0.930 for the RF-SPT model (the RF model based on the SPT data), Acc = 0.88 and F1 = 0.918 for the RF-CPT model (the RF model based on the CPT data), and Acc = 0.872 and F1 = 0.913 for the RF-Vs model (the RF model based on the Vs data), whereas CS performed worst across the datasets. A sensitivity analysis of the RF models under the optimal sampling method was also performed. For the combined models, integration does not always improve performance, and sequential integration failed to improve it in this study. However, the CRF based on Bayesian model averaging performed best, with Acc = 0.924 and F1 = 0.947, outperforming the RF-SPT model.
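The abstract names the sampling and integration methods without defining them, so the following is only a minimal Python sketch under stated assumptions, not the authors' implementation. It assumes that ordered systematic sampling means sorting the cases by one key variable and sending every k-th case to the test set, and that the combined model merges per-model liquefaction probabilities. All function names, the `test_every` parameter, and the example numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def ordered_systematic_split(
    values: Sequence[float], test_every: int = 5
) -> Tuple[List[int], List[int]]:
    """Assumed OSS: sort case indices by `values`, then put every
    `test_every`-th case (in sorted order) into the test set."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    train, test = [], []
    for pos, idx in enumerate(order):
        (test if pos % test_every == test_every - 1 else train).append(idx)
    return train, test


def majority_vote(probs: Sequence[float], threshold: float = 0.5) -> int:
    """Voting: each model casts one vote; 1 means 'liquefaction'."""
    votes = sum(1 for p in probs if p >= threshold)
    return 1 if 2 * votes > len(probs) else 0


def simple_average(probs: Sequence[float]) -> float:
    """Simple averaging of the per-model probabilities."""
    return sum(probs) / len(probs)


def weighted_average(probs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted averaging; weights are normalised inside the function."""
    return sum(p * w for p, w in zip(probs, weights)) / sum(weights)


def accuracy_and_f1(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Acc and F1 for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return acc, f1


if __name__ == "__main__":
    # Hypothetical probabilities from three models for one site.
    site_probs = [0.8, 0.6, 0.3]
    print(majority_vote(site_probs))
    print(simple_average(site_probs))
    print(weighted_average(site_probs, [2.0, 1.0, 1.0]))
```

Bayesian model averaging has the same weighted form as `weighted_average`, with the weights given by posterior model probabilities. How the paper estimates those weights, and how it defines sequential integration, is not stated in the abstract, so neither is sketched here.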
Source journal
Soil Dynamics and Earthquake Engineering (Engineering Technology: Geosciences, Multidisciplinary)
CiteScore: 7.50
Self-citation rate: 15.00%
Articles published: 446
Review time: 8 months
Journal description: The journal aims to encourage and enhance the role of mechanics and other disciplines as they relate to earthquake engineering by providing opportunities for the publication of the work of applied mathematicians, engineers and other applied scientists involved in solving problems closely related to the field of earthquake engineering and geotechnical earthquake engineering. Emphasis is placed on new concepts and techniques, but case histories will also be published if they enhance the presentation and understanding of new technical concepts.