A Novel Approach of Using Feature-Based Machine Learning Models to Expand Coverage of Oil Saturation from Dielectric Logs

Day 2 Tue, October 19, 2021. Pub Date: 2021-10-18. DOI: 10.2118/205162-ms

Mohammed Alghazal, Dimitrios Krinis

Abstract

Dielectric logging is a specialized technique with proprietary interpretation procedures that estimates oil saturation independently of water salinity. Conventional resistivity logging is more routinely used but depends on water salinity and Archie's parameters, leading to high measurement uncertainty in mixed-salinity environments. This paper presents a novel machine learning approach for extending the coverage of dielectric-based oil saturation, driven by features extracted from commonly available reservoir information, petrophysical properties and conventional log data.

More than 20 features were extracted from several sources. Based on sampling frequency, the extracted features are divided into well-based discrete features and petrophysical-based continuous features. Examples of well-based features include well location with respect to the flank (east or west), fluid viscosities and densities, total dissolved solids from surface water, distance to the nearest water injector and injection volume. Petrophysical-based features include height above free water level (HAFWL), porosity, modelled permeability, initial water saturation, resistivity-based saturation, rock type and caliper. In addition, we engineered two new continuous, depth-related features, which we call Height-Below-Crest (HBC) and Height-Above-Top-Injector-Zone (HATIZ).

Initial data exploration was performed using a Pearson correlation heat map. Fluid densities and viscosities show strong correlation (60-80%) with the engineered features (HBC and HATIZ), which helped to capture the effect of viscous and gravity forces across the well's vertical depth. The heat map also shows weak correlation between the individual features and the target variable, the oil saturation from the dielectric log. The dataset of 5000 samples was randomly split into 80% training and 20% testing. A scaling technique that is robust to outliers was used to scale the features prior to modeling.
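The data preparation steps named above (Pearson correlation, a random 80/20 split and outlier-robust scaling) can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic porosity and saturation values, the function names and the choice of median and interquartile range as the robust statistics are all hypothetical.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_robust_scaler(values):
    """Return (median, IQR); both statistics are insensitive to outliers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), (q3 - q1) or 1.0


def robust_scale(values, median, iqr):
    """Center on the median and divide by the interquartile range."""
    return [(v - median) / iqr for v in values]


# Synthetic, purely illustrative samples: (porosity, dielectric oil saturation).
rng = random.Random(42)
rows = []
for _ in range(100):
    porosity = rng.uniform(0.05, 0.30)
    rows.append((porosity, 0.3 + 1.5 * porosity + rng.gauss(0, 0.02)))

train, test = train_test_split(rows)

# Fit the scaler on the training rows only, then reuse it for the test rows,
# so that no information from the test set leaks into the preprocessing.
porosity_train = [row[0] for row in train]
median, iqr = fit_robust_scaler(porosity_train)
scaled_train = robust_scale(porosity_train, median, iqr)
scaled_test = robust_scale([row[0] for row in test], median, iqr)

corr = pearson(porosity_train, [row[1] for row in train])
print(f"train={len(train)} test={len(test)} pearson={corr:.2f}")
```

Fitting the scaler on the training subset alone is a design choice of this sketch; the abstract does not state whether scaling was fitted before or after the split.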
The preliminary performance of various supervised machine learning models, including decision trees, ensemble methods, neural networks and support vector machines, was benchmarked using K-fold cross-validation on the training data prior to testing. The ensemble-based methods, random forest and gradient boosting, produced the lowest mean absolute error and were therefore selected for further hyper-parameter tuning. An exhaustive grid search was performed on both models to find the best-fit parameters, achieving a correlation coefficient of 70% on the testing dataset. Feature analysis indicates that the engineered features, HBC and HATIZ, along with porosity, HAFWL and resistivity-based saturation, are the most important features for predicting the oil saturation from the dielectric log.

Dielectric logging provides an edge over resistivity-based logging in mixed-salinity formations, but requires more elaborate interpretation procedures. In this paper, we present an economical soft-computing alternative that uses ensemble machine learning models to predict dielectric-log oil saturation from features extracted from common reservoir information, petrophysical properties and conventional log data.
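The model-selection procedure described above, K-fold cross-validation scored by mean absolute error followed by an exhaustive grid search, can be sketched as follows. To stay within the Python standard library, a one-feature k-nearest-neighbour regressor stands in for the random forest and gradient boosting models used in the paper. The data, the function names and the grid of `k` values are assumptions made only for illustration.

```python
import random
import statistics


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def knn_predict(train_x, train_y, query, k):
    """Stand-in regressor: mean target of the k nearest training samples."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return statistics.fmean([train_y[i] for i in order[:k]])


def cross_val_mae(x, y, k, n_folds=5):
    """Mean absolute error of the stand-in model, averaged over the folds."""
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(x), n_folds):
        tx = [x[i] for i in train_idx]
        ty = [y[i] for i in train_idx]
        errors = [abs(knn_predict(tx, ty, x[i], k) - y[i]) for i in val_idx]
        fold_errors.append(statistics.fmean(errors))
    return statistics.fmean(fold_errors)


def grid_search(x, y, grid):
    """Score every candidate exhaustively and return (best_k, all_scores)."""
    scores = {k: cross_val_mae(x, y, k) for k in grid}
    return min(scores, key=scores.get), scores


# Synthetic, purely illustrative training data: one scaled feature, one target.
rng = random.Random(1)
x = [rng.uniform(0.0, 1.0) for _ in range(200)]
y = [0.8 - 0.5 * xi + rng.gauss(0, 0.05) for xi in x]

best_k, scores = grid_search(x, y, grid=[1, 5, 15, 50])
print("cross-validated MAE per candidate:", scores)
print("selected hyper-parameter k:", best_k)
```

In this sketch the grid search sees only the training data, so the held-out test set remains available for a final, unbiased check such as the correlation coefficient reported in the abstract.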
