A data-driven PCA-RF-VIM method to identify key factors driving post-fracturing gas production of tight reservoirs

Yifan Zhao, Xiaofan Li, Lei Zuo, Zhongtai Hu, Liangbin Dou, Huagui Yu, Tiantai Li, Jun Lu
Energy Geoscience, Volume 6, Issue 2, Article 100411. Published 2025-04-17.
DOI: 10.1016/j.engeos.2025.100411
https://www.sciencedirect.com/science/article/pii/S2666759225000320

Abstract

Hydraulic fracturing technology has achieved remarkable results in improving the production of tight gas reservoirs, but its effectiveness depends on the combined action of multiple complex factors. Traditional analysis methods are limited in dealing with these complex and interrelated factors and struggle to fully reveal the actual contribution of each factor to production. Machine learning-based methods explore the complex mapping relationships within large amounts of data to provide data-driven insights into the key factors driving production. In this study, a data-driven PCA-RF-VIM (Principal Component Analysis-Random Forest-Variable Importance Measures) approach for analyzing feature importance is proposed to identify the key factors driving post-fracturing production. Four types of parameters (log parameters, geological and reservoir physical parameters, hydraulic fracturing design parameters, and reservoir stimulation parameters) were fed into the PCA-RF-VIM model. The model was trained using 6-fold cross-validation and grid search, yielding a relative importance ranking of the factors. To verify the validity of the PCA-RF-VIM model, a consolidation model that combines three other independent data-driven methods (Pearson correlation coefficient, RF feature importance analysis, and XGBoost feature importance analysis) is used for comparison. A comparison of the two models shows that their top ten parameters are almost the same, differing in only one parameter. Checked against the reservoir characteristics, the PCA-RF-VIM model is shown to be reasonable, and its importance ranking of the parameters is more consistent with the reservoir characteristics of the study area. Ultimately, these ten parameters are selected as the controlling factors with the potential to influence post-fracturing gas production, as their combined importance in driving natural gas production is 91.95%. Identifying these ten controlling factors gives engineers new insight into reservoir selection for fracturing stimulation and into fracturing parameter optimization, with the aim of improving fracturing efficiency and productivity.
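
The comparison step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the three reference methods are merged into the consolidation model, so averaging normalized scores is an assumption made here, and the feature names and scores are placeholders rather than values from the study. The sketch shows how two importance rankings can be compared on their top-k sets and how a combined top-k importance share (such as the reported 91.95% for the top ten) can be computed.

```python
def normalize(scores):
    """Scale non-negative importance scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: value / total for name, value in scores.items()}


def rank(scores):
    """Return feature names ordered from most to least important."""
    return sorted(scores, key=scores.get, reverse=True)


def consolidate(*methods):
    """Average the normalized scores of several methods.

    ASSUMPTION: simple averaging is used only for illustration; the
    abstract does not specify the consolidation rule.
    """
    normalized = [normalize(m) for m in methods]
    names = normalized[0].keys()
    return {n: sum(m[n] for m in normalized) / len(normalized) for n in names}


def top_k_share(scores, k):
    """Fraction of total importance carried by the k top-ranked features."""
    normalized = normalize(scores)
    return sum(normalized[name] for name in rank(normalized)[:k])


def top_k_difference(a, b, k):
    """Features that are in the top k of ranking a but not of ranking b."""
    return set(rank(a)[:k]) - set(rank(b)[:k])


# Placeholder scores for five hypothetical features (not data from the paper).
pca_rf_vim = {"f1": 0.40, "f2": 0.30, "f3": 0.15, "f4": 0.10, "f5": 0.05}
pearson = {"f1": 0.8, "f2": 0.6, "f3": 0.2, "f4": 0.3, "f5": 0.1}  # |r| values
rf = {"f1": 0.35, "f2": 0.30, "f3": 0.10, "f4": 0.20, "f5": 0.05}
xgb = {"f1": 0.50, "f2": 0.20, "f3": 0.10, "f4": 0.15, "f5": 0.05}

consolidated = consolidate(pearson, rf, xgb)

# Compare the top three of each ranking and report the combined share.
only_in_pca_rf_vim = top_k_difference(pca_rf_vim, consolidated, 3)
share = top_k_share(pca_rf_vim, 3)
print(sorted(only_in_pca_rf_vim), round(share, 4))
```

With real data, each dictionary would hold one method's importance score per input parameter, and `k` would be 10 to match the top-ten comparison in the abstract.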

