Porosity estimation using machine learning approaches for shale reservoirs: A case study of the Lianggaoshan Formation, Sichuan Basin, Western China

Impact factor 2.2; Zone 3 (Earth Sciences); JCR Q2 (Geosciences, Multidisciplinary)
Roufeida Bennani , Min Wang , Xin Wang , Tianyi Li
Journal of Applied Geophysics, Volume 237, Article 105702. Published 2025-03-19. DOI: 10.1016/j.jappgeo.2025.105702. Not open access. Source: https://www.sciencedirect.com/science/article/pii/S0926985125000837
Citations: 0

Abstract

Shale porosity is a key petrophysical property that controls hydrocarbon production in shale reservoirs. Determining it accurately is challenging because complex pore structures, diverse mineral compositions, and high organic content complicate the establishment of a physical relationship between reservoir properties and logging data. This study addresses these challenges by developing machine learning models that estimate shale porosity logs from core and well-logging data. Three supervised machine learning algorithms were employed, each with different data proportions: support vector regressor, multilayer perceptron, and random forest. The models were evaluated using the correlation coefficient and root mean square error (RMSE) on both the training and testing datasets. Among these, the random forest model performed best, combining predictions from multiple decision trees and handling nonlinear relationships within the input data. It required minimal preprocessing and parameter tuning and predicted shale porosity accurately, with a high correlation of 93.8% and a low RMSE of 0.206. These results confirmed the model's suitability for limited and complex datasets. In contrast, the multilayer perceptron and support vector regressor were more sensitive to hyperparameter configurations and prone to overfitting, which resulted in lower accuracy and weaker correlation values than the random forest model.
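The model comparison described in this paragraph can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name a library, so scikit-learn is an assumption, the data are synthetic stand-ins for well-log features and core porosity, and the hyperparameters, the 70/30 split, and the helper names (`make_synthetic_logs`, `score`, `compare_models`) are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic_logs(n_samples=300, seed=0):
    """Hypothetical stand-in for log features and core porosity (not the paper's data)."""
    rng = np.random.default_rng(seed)
    # Three made-up, rescaled "log" features.
    X = rng.uniform(0.0, 1.0, size=(n_samples, 3))
    # Made-up nonlinear porosity response plus measurement noise.
    y = 4.0 + 3.0 * np.sin(3.0 * X[:, 0]) - 2.0 * X[:, 1] ** 2 + X[:, 0] * X[:, 2]
    y += rng.normal(0.0, 0.2, size=n_samples)
    return X, y


def score(y_true, y_pred):
    """Return (Pearson correlation coefficient, RMSE)."""
    r = float(np.corrcoef(y_true, y_pred)[0, 1])
    rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
    return r, rmse


def compare_models(test_size=0.3, seed=0):
    """Fit the three regressors on one split and score them on train and test data."""
    X, y = make_synthetic_logs(seed=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    models = {
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
        # SVR and MLP are scale-sensitive, so a StandardScaler is placed in front.
        "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
        "mlp": make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=seed),
        ),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        results[name] = {
            "train": score(y_tr, model.predict(X_tr)),
            "test": score(y_te, model.predict(X_te)),
        }
    return results


if __name__ == "__main__":
    for name, res in compare_models().items():
        (r_tr, e_tr), (r_te, e_te) = res["train"], res["test"]
        print(f"{name:14s} train r={r_tr:.3f} RMSE={e_tr:.3f} | "
              f"test r={r_te:.3f} RMSE={e_te:.3f}")
```

A large gap between the training and testing scores of a model is one common sign of the overfitting mentioned above. The numbers printed for this synthetic data say nothing about the values reported in the study.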
In addition, a randomization process was introduced during the training phase, using an appropriate data proportion, to assess the model's reliability and minimize overfitting. The results indicated that this process had no significant impact on performance, confirming its effectiveness in ensuring accurate predictions.
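The abstract does not detail the randomization step. One possible reading is to reshuffle the training/testing split several times at a fixed proportion and check that the test error stays stable. The sketch below illustrates that reading only: scikit-learn, the 30% test proportion, the number of repeats, the synthetic data, and the function name `repeated_shuffle_rmse` are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import ShuffleSplit


def repeated_shuffle_rmse(X, y, n_repeats=10, test_size=0.3, seed=0):
    """Refit on several random train/test shuffles and collect each test RMSE."""
    splitter = ShuffleSplit(n_splits=n_repeats, test_size=test_size, random_state=seed)
    rmses = []
    for train_idx, test_idx in splitter.split(X):
        model = RandomForestRegressor(n_estimators=50, random_state=seed)
        model.fit(X[train_idx], y[train_idx])
        residual = y[test_idx] - model.predict(X[test_idx])
        rmses.append(float(np.sqrt(np.mean(residual ** 2))))
    return np.array(rmses)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Made-up log features and porosity values, not the paper's data.
    X = rng.uniform(0.0, 1.0, size=(200, 3))
    y = 4.0 + 3.0 * np.sin(3.0 * X[:, 0]) - 2.0 * X[:, 1] ** 2
    y += rng.normal(0.0, 0.2, size=200)
    rmses = repeated_shuffle_rmse(X, y)
    print(f"test RMSE over shuffles: mean={rmses.mean():.3f}, std={rmses.std():.3f}")
```

A small standard deviation across shuffles would be consistent with the statement that randomization had no significant impact on performance.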
Source journal: Journal of Applied Geophysics (Earth Science: Geosciences, Multidisciplinary)
CiteScore: 3.60
Self-citation rate: 10.00%
Articles published: 274
Review time: 4 months
Journal description: The Journal of Applied Geophysics, with its key objective of responding to pertinent and timely needs, places particular emphasis on methodological developments and innovative applications of geophysical techniques for addressing environmental, engineering, and hydrological problems. Related topical research in exploration geophysics and in soil and rock physics is also covered.