Knowledge Extraction via Machine Learning Guides a Topology-Based Permeability Prediction Model
Authors: Jia Zhang, Gang Ma, Zhibing Yang, Jiangzhou Mei, Daren Zhang, Wei Zhou, Xiaolin Chang
Journal: Water Resources Research
Publication date: 2024-07-12
Publication type: Journal Article
DOI: https://doi.org/10.1029/2024wr037124
Abstract
The complexity and heterogeneity of pore structure make accurate permeability estimation difficult. Commonly used empirical formulas neglect the microscopic and topological characteristics of the pore structure and therefore lack accuracy and adaptability. Machine learning (ML) and deep learning (DL) models demonstrate promising performance, but they face challenges of data availability, computational cost, and model interpretability. The present study aims to develop a more robust and accurate permeability prediction model via knowledge extraction from an ML model. We first establish an ML model relating permeability to the geometric and topological characteristics of porous media using the Extreme Gradient Boosting (XGBoost) algorithm. The data set used to fit the ML model is prepared from 458 samples of different types of porous media. Using SHapley Additive exPlanations (SHAP) values, the influence of each feature on permeability prediction is quantified. Six features are found to be vital for permeability prediction: closeness centrality (a topological feature); tortuosity and porosity (macroscopic features); and throat diameter, throat length, and pore diameter (pore network features). Guided by partial dependence calculations, a functional relationship between permeability and these six most important features, previously unknown, is established. The resulting permeability prediction model, which incorporates a topological feature, improves prediction accuracy and demonstrates strong applicability across diverse data sets. This new model presents an optimal balance between simplicity and performance, making it a compelling alternative for permeability prediction in porous media. The research provides a novel, reusable framework of knowledge extraction via ML that reveals the important features and establishes the relationships among them, and that can be extended and applied in other research fields.
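Closeness centrality, the topological feature highlighted in the abstract, can be illustrated on a toy pore network. The sketch below is not the authors' procedure: it assumes an unweighted, undirected, connected graph in which pores are nodes and throats are edges, and it uses the standard definition C(v) = (n - 1) / (sum of shortest-path hop counts from v). The network `pore_network`, the function name, and the use of the mean as a sample-level summary are all hypothetical choices made for illustration.

```python
from collections import deque


def closeness_centrality(adjacency):
    """Return {node: closeness} for an unweighted, undirected graph.

    Assumes the graph is connected; distances are hop counts found by
    breadth-first search (an assumption, not taken from the paper).
    """
    n = len(adjacency)
    result = {}
    for source in adjacency:
        # Breadth-first search yields hop-count distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        result[source] = (n - 1) / total if total > 0 else 0.0
    return result


# Hypothetical toy pore network: pores are nodes, throats are edges.
pore_network = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

centrality = closeness_centrality(pore_network)
# One possible sample-level summary (an assumption): the network mean.
mean_closeness = sum(centrality.values()) / len(centrality)
```

Well-connected interior pores (here B and D) score higher than dead-end pores (A and E), which is the sense in which the measure captures how easily the rest of the network can be reached from a given pore.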
About the journal:
Water Resources Research (WRR) is an interdisciplinary journal that focuses on hydrology and water resources. It publishes original research in the natural and social sciences of water. It emphasizes the role of water in the Earth system, including physical, chemical, biological, and ecological processes in water resources research and management, including social, policy, and public health implications. It encompasses observational, experimental, theoretical, analytical, numerical, and data-driven approaches that advance the science of water and its management. Submissions are evaluated for their novelty, accuracy, significance, and broader implications of the findings.