An explainable experience-driven hybrid model for TOC prediction in shale reservoirs based on data augmentation

IF 2.1 | Zone 3 (Earth Sciences) | JCR Q2 (Geosciences, Multidisciplinary)
Yuzhen Hong, Shaogui Deng, Zhijun Li, Yueqin Guo, Zhoutuo Wei
Journal of Applied Geophysics, Volume 243, Article 105977. Published 2025-10-02.
DOI: 10.1016/j.jappgeo.2025.105977
Full text: https://www.sciencedirect.com/science/article/pii/S0926985125003581
Citations: 0

Abstract

Total organic carbon (TOC) content measures the carbon held in organic compounds and is a critical indicator for assessing unconventional shale resources. An accurate TOC prediction model can therefore help evaluate a reservoir's hydrocarbon potential at low cost and improve development efficiency. However, sparse experimental data and strong reservoir heterogeneity make TOC prediction difficult. This study combines data enhancement techniques with expert experience-driven machine learning models to predict TOC accurately in complex shale reservoirs. First, we propose a set of data enhancement methods to address weak logging response and insufficient TOC experimental data: we enrich the training dataset by introducing reconstruction curves to visualize the response and by designing Generative Adversarial Network (GAN) simulations to generate high-quality data. To construct the experience-driven model, we optimized the traditional ΔlogR method by integrating expert knowledge with a detailed analysis of the physical properties of shale reservoirs, and we propose a density-gamma modified ΔlogR method as the core of the experience-driven approach. We then integrated this empirical formula into the fitness function of the Grey Wolf Optimizer (GWO) and combined it with a Support Vector Regression (SVR) model to build a hybrid model. The hybrid method was tested in the Dongying Depression. For wells A and B, the R² values were 0.95 and 0.97, respectively, the Root Mean Square Error (RMSE) values were 0.31 and 0.29, and the Mean Absolute Error (MAE) values were below 0.3. These results are a significant improvement over any single method. We also used the SHapley Additive exPlanations (SHAP) method to analyze the correlation between well logging curves and prediction results. By revealing the decision-making mechanism within the model, we verified the reasonableness of the experience-driven approach and enhanced the model's credibility.
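For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal Python sketch of the traditional ΔlogR calculation (the widely used sonic/resistivity form from Passey et al., 1990) together with the three reported error metrics. It is not the authors' density-gamma modified method, which the abstract does not spell out. The function names, argument names, and sample values are all hypothetical and chosen only for illustration.

```python
import math


def delta_log_r(rt, dt, rt_base, dt_base, k=0.02):
    """Traditional sonic/resistivity overlay separation (Passey-style).

    rt, rt_base: deep resistivity and its baseline in an organic-lean shale (ohm·m)
    dt, dt_base: sonic transit time and its baseline (µs/ft)
    k: scaling factor between the two curves (0.02 in the classic form)
    """
    return math.log10(rt / rt_base) + k * (dt - dt_base)


def toc_from_delta_log_r(dlogr, lom):
    """Convert ΔlogR to TOC (wt%) using the level of organic metamorphism (LOM)."""
    return dlogr * 10 ** (2.297 - 0.1688 * lom)


def regression_metrics(y_true, y_pred):
    """Return (R², RMSE, MAE) for paired measured and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return 1.0 - ss_res / ss_tot, rmse, mae


if __name__ == "__main__":
    # Hypothetical log readings and core TOC values, for illustration only.
    samples = [(12.0, 95.0), (20.0, 105.0), (35.0, 110.0)]  # (rt, dt) pairs
    core_toc = [1.8, 3.1, 4.4]
    predicted = [
        toc_from_delta_log_r(delta_log_r(rt, dt, rt_base=5.0, dt_base=90.0), lom=10.0)
        for rt, dt in samples
    ]
    r2, rmse, mae = regression_metrics(core_toc, predicted)
    print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

In a hybrid scheme like the one described, an empirical estimate of this kind could plausibly serve as a reference term when scoring candidate SVR hyperparameters, but how the authors actually fold it into the GWO fitness function is not stated in the abstract.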
Source journal
Journal of Applied Geophysics (Earth Sciences: Geosciences, Multidisciplinary)
CiteScore: 3.60
Self-citation rate: 10.00%
Articles published: 274
Review time: 4 months
About the journal: The Journal of Applied Geophysics, with its key objective of responding to pertinent and timely needs, places particular emphasis on methodological developments and innovative applications of geophysical techniques for addressing environmental, engineering, and hydrological problems. Related topical research in exploration geophysics and in soil and rock physics is also covered by the journal.