A Novel Approach of Using Feature-Based Machine Learning Models to Expand Coverage of Oil Saturation from Dielectric Logs
Mohammed Alghazal, Dimitrios Krinis
DOI: https://doi.org/10.2118/205162-ms
Published in: Day 2 Tue, October 19, 2021
Publication date: 2021-10-18
Abstract
The dielectric log is a specialized tool with proprietary procedures that predicts oil saturation independently of water salinity. Conventional resistivity logging is used more routinely but depends on water salinity and Archie's parameters, leading to high measurement uncertainty in mixed-salinity environments. This paper presents a novel machine learning approach for extending the coverage of dielectric-based oil saturation, driven by features extracted from commonly available reservoir information, petrophysical properties and conventional log data.
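For context, the salinity dependence of resistivity-based saturation can be seen in Archie's equation. It is given here in its standard textbook form; the symbols are the conventional ones and are not defined in the abstract itself, and the second relation assumes a two-phase oil-water system.

```latex
\[
S_w = \left( \frac{a \, R_w}{\phi^{m} \, R_t} \right)^{1/n},
\qquad
S_o = 1 - S_w
\]
```

Here $R_w$ is the formation water resistivity, which varies with salinity, $R_t$ is the true formation resistivity, $\phi$ is porosity, and $a$, $m$ and $n$ are Archie's parameters (tortuosity factor, cementation exponent and saturation exponent). An uncertain $R_w$ in a mixed-salinity reservoir therefore carries directly into $S_w$.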
More than 20 features were extracted from several sources. Based on sampling frequency, the extracted features are divided into well-based discrete features and petrophysical-based continuous features. Examples of well-based features include well location with respect to the flank (east or west), fluid viscosities and densities, total dissolved solids from surface water, distance to the nearest water injector and injection volume. Petrophysical-based features include height above free water level (HAFWL), porosity, modelled permeability, initial water saturation, resistivity-based saturation, rock type and caliper. In addition, we engineered two new continuous, depth-related features, which we call Height-Below-Crest (HBC) and Height-Above-Top-Injector-Zone (HATIZ).
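The abstract does not give formulas for HBC and HATIZ. A minimal sketch, assuming both are plain vertical distances computed from true vertical depth subsea (TVDSS, increasing downward), might look like the following. The function name `add_depth_features`, the `tvdss` key and the two reference depths are hypothetical and chosen only for illustration.

```python
def add_depth_features(samples, crest_depth, top_injector_zone_depth):
    """Return copies of the log samples with two depth-related features added.

    Assumptions (not stated in the abstract):
      * each sample is a dict with a "tvdss" depth that increases downward;
      * HBC   = sample depth minus the depth of the structural crest;
      * HATIZ = depth of the top of the injector zone minus sample depth.
    """
    enriched = []
    for sample in samples:
        depth = sample["tvdss"]
        enriched.append(
            {
                **sample,  # keep the original features untouched
                "HBC": depth - crest_depth,
                "HATIZ": top_injector_zone_depth - depth,
            }
        )
    return enriched


if __name__ == "__main__":
    # Hypothetical depths in feet, for illustration only.
    logs = [{"tvdss": 6050.0, "porosity": 0.21}]
    print(add_depth_features(logs, crest_depth=6000.0, top_injector_zone_depth=6200.0))
```

Because both features are computed per depth sample, they are continuous along the wellbore, unlike the well-based features, which take a single value per well.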
Initial data exploration was performed using a Pearson correlation heat map. Fluid densities and viscosities show strong correlation (60-80%) with the engineered features (HBC and HATIZ), which helped capture the effect of viscous and gravity forces across the well's vertical depth. The heat map also shows weak correlation between the features and the target variable, the oil saturation from the dielectric log. The dataset of 5000 samples was randomly split into 80% training and 20% testing. A scaling technique that is robust to outliers was used to scale the features prior to modeling. The preliminary performance of various supervised machine learning models, including decision trees, ensemble methods, neural networks and support vector machines, was benchmarked using K-fold cross-validation on the training data prior to testing. The ensemble-based methods, random forest and gradient boosting, produced the lowest mean absolute error of the methods tested and were therefore selected for further hyper-parameter tuning. An exhaustive grid search was performed on both models to find the best-fit parameters, achieving a correlation coefficient of 70% on the testing dataset. Feature analysis indicates that the engineered features, HBC and HATIZ, along with porosity, HAFWL and resistivity-based saturation, are the most important features for predicting the oil saturation from the dielectric log.
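The modeling workflow described above (80/20 split, outlier-robust scaling, K-fold benchmarking by mean absolute error, grid search, correlation on the test set) can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the hyper-parameter grid and fold count are arbitrary, and `make_synthetic_data` and `benchmark_and_tune` are hypothetical helper names.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import (
    GridSearchCV,
    KFold,
    cross_val_score,
    train_test_split,
)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler


def make_synthetic_data(n_samples=400, n_features=5, seed=0):
    """Synthetic stand-in for the feature table; NOT the paper's data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    y = 0.6 * X[:, 0] - 0.3 * X[:, 1] ** 2 + rng.normal(scale=0.3, size=n_samples)
    return X, y


def benchmark_and_tune(X, y, seed=0):
    # Random 80/20 train/test split, as stated in the abstract.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # The two ensemble methods the abstract selects for tuning.
    models = {
        "random_forest": RandomForestRegressor(random_state=seed),
        "gradient_boosting": GradientBoostingRegressor(random_state=seed),
    }
    cv = KFold(n_splits=3, shuffle=True, random_state=seed)
    # Assumed grid; the abstract does not list the tuned parameters.
    param_grid = {"model__n_estimators": [50, 100], "model__max_depth": [3, 6]}

    results = {}
    for name, model in models.items():
        # RobustScaler centers on the median and scales by the IQR,
        # so extreme values have less influence than with standard scaling.
        pipe = Pipeline([("scale", RobustScaler()), ("model", model)])
        # K-fold benchmark on the training data only, scored by MAE.
        scores = cross_val_score(
            pipe, X_train, y_train, cv=cv, scoring="neg_mean_absolute_error"
        )
        # Exhaustive grid search, then evaluate the refit model on the test set.
        search = GridSearchCV(
            pipe, param_grid, cv=cv, scoring="neg_mean_absolute_error"
        )
        search.fit(X_train, y_train)
        r = np.corrcoef(y_test, search.predict(X_test))[0, 1]
        results[name] = {
            "cv_mae": float(-scores.mean()),
            "best_params": search.best_params_,
            "test_r": float(r),
        }
    return results


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for name, res in benchmark_and_tune(X, y).items():
        print(name, res)
```

Placing the scaler inside the pipeline means it is refit on each training fold, so no statistics from the validation fold or the test set leak into the preprocessing step.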
The dielectric log provides an edge over resistivity-based logging techniques in mixed-salinity formations, but requires more elaborate interpretation procedures. In this paper, we present an economical soft-computing alternative that uses ensemble machine learning models to predict oil saturation from the dielectric log, given features extracted from common reservoir information, petrophysical properties and conventional log data.