Uncertainty and explainable analysis of machine learning model for reconstruction of sonic slowness logs

Hua Wang, Yuqiong Wu, Yushun Zhang, Fuqiang Lai, Zhou Feng, Bing Xie, Ailin Zhao
Artificial Intelligence in Geosciences, Volume 4, Pages 182-198. Published 2023-12-01.
DOI: 10.1016/j.aiig.2023.11.002
Full text: https://www.sciencedirect.com/science/article/pii/S2666544123000321

Abstract

Well logs provide valuable information for oil and gas fields: they help determine the lithology of the formations surrounding the borehole and the location and reserves of subsurface oil and gas reservoirs. However, important logs are often missing in horizontal or old wells, which poses a challenge in field applications. Conventional methods supplement the missing logs either by combining geological experience with data from nearby boreholes or by reconstructing them directly from the remaining logs in the same borehole. Nevertheless, there is currently no quantitative evaluation of the quality and plausibility of the reconstructed log. In this paper, we use data from the 2020 machine learning competition of the Society of Petrophysicists and Well Log Analysts (SPWLA), which aims to predict the missing compressional wave slowness (DTC) and shear wave slowness (DTS) logs from other logs in the same borehole. We employ the natural gradient boosting (NGBoost) algorithm to construct an ensemble learning model that predicts both the results and their uncertainty. Furthermore, we apply the SHAP (SHapley Additive exPlanations) method to investigate the interpretability of the machine learning model. We compare the performance of the NGBoost model with four other commonly used ensemble learning methods: Random Forest, GBDT, XGBoost, and LightGBM. The results show that the NGBoost model performs well on the test set and provides a probability distribution for each prediction. This distribution allows petrophysicists to quantitatively analyze the confidence interval of the reconstructed log. In addition, the variance of the predicted probability distribution can be used to assess the quality of the reconstructed log. Using SHAP, we calculate the importance of each input log to the predicted results as well as the coupling relationships among input logs. Our findings reveal that the NGBoost model tends to predict larger slowness values when the neutron porosity (CNC) and gamma ray (GR) readings are high, which is consistent with petrophysical models. Furthermore, the machine learning model captures the influence of a changing borehole caliper on slowness, an influence that is complex and difficult to express as a direct relationship. These findings are in line with the physical principles of borehole acoustics. Finally, the explainable machine learning model shows that, although we did not correct the neutron porosity log for the effect of borehole caliper during preprocessing, the model assigned greater importance to the caliper, achieving the same effect as a caliper correction.
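
The use of a predicted probability distribution described in the abstract can be illustrated with a minimal sketch. It assumes that each depth sample comes with a Normal predictive distribution (a mean and a standard deviation), as a probabilistic regressor such as NGBoost can provide. The depth values, slowness values, the 95% level, the `max_std` quality threshold, and the helper names `prediction_interval` and `flag_uncertain` are all hypothetical and are not taken from the paper.

```python
from statistics import NormalDist

# Hypothetical per-depth predictions:
# (depth in m, predicted mean DTC in us/ft, predicted standard deviation).
# Illustrative numbers only; they are not results from the paper.
predictions = [
    (2500.0, 72.0, 1.5),
    (2500.5, 75.0, 2.0),
    (2501.0, 90.0, 8.0),
]


def prediction_interval(mean, std, level=0.95):
    """Return the central interval of Normal(mean, std) at the given level."""
    dist = NormalDist(mu=mean, sigma=std)
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def flag_uncertain(preds, max_std):
    """Return depths whose predictive std exceeds max_std (an assumed QC threshold)."""
    return [depth for depth, _, std in preds if std > max_std]


# One interval per depth sample, plus the depths judged unreliable.
intervals = [prediction_interval(mean, std) for _, mean, std in predictions]
low_confidence = flag_uncertain(predictions, max_std=5.0)

for (depth, mean, _), (low, high) in zip(predictions, intervals):
    print(f"{depth:.1f} m: DTC = {mean:.1f} us/ft, 95% interval [{low:.1f}, {high:.1f}]")
print("Low-confidence depths:", low_confidence)
```

A wide interval or a large standard deviation at a given depth marks a section of the reconstructed log that deserves closer inspection; the threshold itself would have to be chosen for the field and tool in question.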
