A. Voskresenskiy, N. Bukhanov, M. A. Kuntsevich, O. Popova, Alexey S. Goncharov
Rock Type Classification Models Interpretability Using Shapley Values
DOI: 10.2118/207707-ms (https://doi.org/10.2118/207707-ms)
Published in: Day 3 Wed, November 17, 2021. Publication date: 2021-12-09.
Citations: 1
Abstract
We propose a methodology to improve rock type classification using machine learning (ML) techniques and to reveal causal relationships between reservoir quality and well log measurements. Rock type classification is an essential step in accurate reservoir modeling and forecasting. Machine learning approaches make it possible to automate rock type classification based on different well logs and core data. To choose the best model, one that does not propagate uncertainty further into the workflow, it is important to interpret machine learning results. Feature importance and feature selection methods are usually employed for this purpose. We propose an extension to existing approaches: a model-agnostic sensitivity algorithm based on Shapley values.
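For reference, the Shapley value of a feature can be written in its standard game-theoretic form. The notation is the textbook one and is not taken from the paper: N is the set of features (here, the well logs), S is a subset of features that excludes feature i, and v(S) is the payoff of that subset, for example the model output when only the features in S are known. How v is evaluated for a particular ML model is a modeling choice that the abstract does not specify.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}\,
  \bigl( v(S \cup \{i\}) - v(S) \bigr)
```

The weights sum to one over all subsets S, so the value is a weighted average of the marginal contribution of feature i, and the values of all features add up to v(N) - v(∅).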
The paper describes a full workflow for rock type prediction using well log data, from data preparation, model building, and feature selection to causal inference analysis. We built ML models that classify rock types using well logs (sonic, gamma, density, photoelectric, and resistivity) from 21 wells as predictors and conducted a causal inference analysis between reservoir quality and well log responses using Shapley values (a concept from game theory). As a result of feature selection, we obtained predictors that are statistically significant and at the same time relevant in the context of causal relations.
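As an illustration of how Shapley values attribute a single prediction to individual well logs, the following is a minimal sketch and not the authors' implementation. It computes exact Shapley values by enumerating all feature subsets, which is feasible for five predictors. The names `toy_model`, `shapley_values`, `sample`, and `baseline` are hypothetical, the model weights are invented, and replacing absent features with a baseline value is an assumed (though common) way to define the payoff of a subset.

```python
from itertools import combinations
from math import exp, factorial

# Feature names mirroring the five logs listed above.
FEATURES = ["sonic", "gamma", "density", "photoelectric", "resistivity"]


def toy_model(x):
    """Hypothetical stand-in classifier returning a 'reservoir' probability.

    The weights are invented for illustration and are not fitted to any data.
    """
    weights = [0.8, -0.6, -0.5, 0.1, 0.3]
    score = sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + exp(-score))


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a subset are set to their baseline value (an assumed
    payoff definition). The model is only called, never inspected, so the
    procedure is model agnostic.
    """
    n = len(x)

    def payoff(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):  # subset sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (payoff(s | {i}) - payoff(s))
        phi.append(total)
    return phi


# Standardized log readings for one hypothetical depth sample.
sample = [1.2, -0.7, -0.4, 0.1, 0.5]
baseline = [0.0] * len(FEATURES)  # assumed reference point: the mean of each log

phi = shapley_values(toy_model, sample, baseline)
for name, value in zip(FEATURES, phi):
    print(f"{name:>14}: {value:+.4f}")
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline. Exact enumeration grows exponentially with the number of features, so larger feature sets would typically need a sampling-based approximation.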
The macro F1-scores of the best models obtained for the two cases are 0.79 and 0.85, respectively. We found that the ML models can infer domain knowledge, which allows us to confirm the adequacy of the built ML models for rock type prediction. Our insight was to recognize the need to properly account for the underlying causal structure between the features and rock types in order to derive meaningful and relevant predictors that carry a significant amount of information contributing to the final outcome. We also demonstrate the robustness of the revealed patterns by applying the Shapley value methodology to a number of ML models and showing that the order of the most important predictors is consistent.
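The macro F1-score quoted above is the unweighted mean of the per-class F1-scores, so rare rock types count as much as common ones. A minimal sketch of the computation follows. The function name `macro_f1` and the label lists are hypothetical and are not data from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over classes seen in y_true or y_pred."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical rock type labels (core-derived) and model predictions.
y_true = ["sand", "sand", "sand", "sand", "shale", "shale", "silt", "silt"]
y_pred = ["sand", "sand", "sand", "shale", "shale", "shale", "silt", "sand"]

print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Because each class contributes equally, a model that ignores a minority rock type is penalized more heavily than it would be under plain accuracy.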
Our analysis shows that machine learning classifiers that achieve high accuracy tend to mimic the physical principles behind different logging tools. In particular, the longer the travel time of an acoustic wave, the higher the probability that the medium is reservoir rock, and vice versa. Conversely, lower values of natural radioactivity and rock density indicate the presence of a reservoir.
The article presents a causal inference analysis of ML classification models using Shapley values on two real-world reservoirs. The rock class labels from core data are used to train a supervised machine learning algorithm to predict classes from well log responses. The aim of supervised learning here is to label a small portion of a dataset and let the algorithm automate the labeling of the rest. Such data-driven analysis may help optimize well logging, coring, and core analysis programs. This algorithm can be extended to any other reservoir to improve rock type prediction.
The novelty of the paper is that such analysis reveals the nature of the decisions made by the ML model and makes it possible to apply robust, reliable, petrophysics-consistent ML models for rock type classification.