Site-specific soil water characteristic curve prediction with extremely scarce data using data-driven hierarchical Bayesian model
Authors: Menglu Huang, Shin-Ichi Nishimura, Toshifumi Shibata, Linghao Huang, Shiying Zheng
Journal: Transportation Geotechnics, Volume 51, Article 101527 (published 2025-02-19)
DOI: 10.1016/j.trgeo.2025.101527
URL: https://www.sciencedirect.com/science/article/pii/S2214391225000467
Impact factor: 4.9 (JCR Q1, Engineering, Civil)
Citations: 0
Abstract
The soil-water characteristic curve (SWCC) is fundamental for understanding the hydro-mechanical behavior of unsaturated soils and is widely applied in various fields. However, determining the SWCC through laboratory experiments is time-consuming, so efficient SWCC prediction models are highly valuable for timely decision-making. Existing methods face fundamental limitations: Bayesian approaches rely on predefined empirical models that may fail to fully capture soil–water interactions, while current data-driven machine learning models struggle with extremely sparse measurements, incomplete inputs, and uncertainty quantification. To address these challenges, this study introduces a data-driven hierarchical Bayesian model (HBM) that integrates an indirect database with extremely sparse site-specific measurements (e.g., fewer than four data points) to reliably predict the SWCC. The HBM operates in two stages: hyperparameters are first estimated from the database to establish a prior model; then, in the inference stage, the prior model is refined through transfer learning to generate a quasi-site-specific posterior model. Using conjugate priors and Gibbs sampling, this approach enables robust SWCC predictions with severely limited data and/or incomplete soil parameters. A comprehensive drying SWCC database with ten essential soil parameters is compiled to train and validate the model through three case studies and leave-one-site-out cross-validation. The results show that the HBM outperforms widely used machine learning models, such as Artificial Neural Networks and Extreme Gradient Boosting, offering a robust solution for SWCC prediction under site-specific data constraints.
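The two-stage idea described in the abstract (a prior built from a database, then refined with a handful of site measurements via conjugate priors and Gibbs sampling) can be illustrated with a deliberately simplified toy. The sketch below is not the authors' model: it assumes a single scalar quantity per site (for example, water content at one fixed suction) with a Gaussian likelihood, a Gaussian prior on the site mean, and an inverse-gamma prior on the noise variance. All numbers, names (`DATABASE_SITE_VALUES`, `SITE_MEASUREMENTS`, `estimate_prior`, `gibbs_site_posterior`), and the hyperparameters `a0` and `b0` are hypothetical and chosen only for illustration.

```python
import random
import statistics

# Hypothetical values for illustration only (not taken from the paper):
# one scalar quantity per database site, and three measurements at the target site.
DATABASE_SITE_VALUES = [0.30, 0.34, 0.28, 0.36, 0.32]
SITE_MEASUREMENTS = [0.40, 0.42, 0.41]  # fewer than four data points


def estimate_prior(database_values):
    """Stage 1: estimate Gaussian prior hyperparameters from the database."""
    prior_mean = statistics.mean(database_values)
    prior_var = statistics.variance(database_values)  # sample variance
    return prior_mean, prior_var


def gibbs_site_posterior(y, prior_mean, prior_var, a0=2.0, b0=1e-3,
                         n_iter=5000, burn_in=1000, seed=0):
    """Stage 2: Gibbs sampling for the model y_i ~ N(mu, sigma2).

    Assumed priors: mu ~ N(prior_mean, prior_var) and
    sigma2 ~ Inverse-Gamma(a0, b0). Both full conditionals have closed
    forms, so each Gibbs step is a direct draw.
    """
    rng = random.Random(seed)
    n = len(y)
    mu, sigma2 = prior_mean, b0 / (a0 - 1.0)  # start at the prior means
    mu_samples, sigma2_samples = [], []
    for it in range(n_iter):
        # mu | sigma2, y: Gaussian with a precision-weighted mean
        precision = 1.0 / prior_var + n / sigma2
        mean = (prior_mean / prior_var + sum(y) / sigma2) / precision
        mu = rng.gauss(mean, precision ** -0.5)
        # sigma2 | mu, y: inverse-gamma, drawn as the reciprocal of a gamma
        # variate (gammavariate takes shape and scale, so scale = 1 / rate)
        shape = a0 + 0.5 * n
        rate = b0 + 0.5 * sum((yi - mu) ** 2 for yi in y)
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        if it >= burn_in:
            mu_samples.append(mu)
            sigma2_samples.append(sigma2)
    return mu_samples, sigma2_samples
```

Calling `estimate_prior(DATABASE_SITE_VALUES)` and passing the result to `gibbs_site_posterior(SITE_MEASUREMENTS, ...)` yields posterior draws whose mean is pulled from the database-level prior toward the site measurements, and whose spread gives a simple uncertainty estimate. A full SWCC model would need a curve-valued or multivariate formulation, which this scalar sketch does not attempt.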
About the journal:
Transportation Geotechnics is a journal dedicated to publishing high-quality, theoretical, and applied papers that cover all facets of geotechnics for transportation infrastructure such as roads, highways, railways, underground railways, airfields, and waterways. The journal places a special emphasis on case studies that present original work relevant to the sustainable construction of transportation infrastructure. The scope of topics it addresses includes the geotechnical properties of geomaterials for sustainable and rational design and construction, the behavior of compacted and stabilized geomaterials, the use of geosynthetics and reinforcement in constructed layers and interlayers, ground improvement and slope stability for transportation infrastructures, compaction technology and management, maintenance technology, the impact of climate, embankments for highways and high-speed trains, transition zones, dredging, underwater geotechnics for infrastructure purposes, and the modeling of multi-layered structures and supporting ground under dynamic and repeated loads.