Reconstruction of Missing Segments in Well Data History Using Data Analytics

Yuanjun Li, R. Horne, A. Al Shmakhy, Tania Felix Menchaca
DOI: 10.2118/208137-ms (https://doi.org/10.2118/208137-ms)
Published in: Day 3 Wed, November 17, 2021
Publication date: 2021-12-09

Abstract

Missing data occur frequently in well production history records. Because of network outages, facility maintenance, or equipment failure, the time-series production data measured by surface and downhole gauges can be intermittent. Such fragmentary data are an obstacle to reservoir management. An incomplete dataset is commonly simplified by omitting all observations that contain missing values, which leads to significant information loss. To fill the gaps instead, in this study we developed and tested several missing-data imputation approaches based on machine learning and deep learning methods.

Traditional imputation methods, such as interpolation and filling with the most frequent value, can bias the data because they do not consider the correlations between features. We therefore investigated several multivariate imputation algorithms that use the entire set of available data streams to estimate the missing values. The methods use a full suite of well measurements, including wellhead and downhole pressures; oil, water, and gas flow rates; surface and downhole temperatures; and choke settings. Any parameter that has gaps in its recorded history can be imputed from the other available data streams.

The models were tested on both synthetic data and real datasets from operating Norwegian and Abu Dhabi reservoirs. Based on the characteristics of the field data, we introduced different patterns of contiguous missing data into the complete dataset: combinations of single or multiple missing sections over short or long time spans. We observed that, as the missing time span expands, the more successful methods remain stable up to a threshold of 30% of the entire dataset. In addition, for a single missing section over a shorter period, which could represent a weather perturbation, most of the methods we tried achieved high imputation accuracy. For multiple missing sections over a longer time span, which is typical of gauge failures, other methods were better candidates for capturing the overall correlation in the multivariate dataset.

Most missing-data problems addressed in our industry focus on single-feature imputation. In this study, we developed an efficient procedure that enables fast reconstruction of an entire production dataset with multiple missing sections in different variables. Ultimately, the complete information can support reservoir history matching, production allocation, and the development of models for reservoir performance prediction.
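The evaluation idea described in the abstract (remove contiguous sections from a complete record, impute them, and compare against the known values) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the synthetic well signals, the helper names (`make_gap_mask`, `impute_interpolate`, `impute_regression`), and the use of ordinary least squares as a stand-in for the machine learning and deep learning imputers are all assumptions made for the example.

```python
import numpy as np

def make_gap_mask(n_rows, n_sections, missing_fraction):
    """Boolean mask with n_sections contiguous gaps that together
    cover roughly missing_fraction of the rows (hypothetical helper)."""
    gap_len = int(round(missing_fraction * n_rows / n_sections))
    window = n_rows // n_sections
    mask = np.zeros(n_rows, dtype=bool)
    for k in range(n_sections):
        # Center one gap inside each equal-sized window so gaps never overlap.
        start = k * window + (window - gap_len) // 2
        mask[start:start + gap_len] = True
    return mask

def impute_interpolate(t, y, mask):
    """Univariate baseline: linear interpolation across each gap."""
    out = y.copy()
    out[mask] = np.interp(t[mask], t[~mask], y[~mask])
    return out

def impute_regression(X, y, mask):
    """Multivariate stand-in: least-squares fit of the gappy channel on the
    other channels, trained on observed rows and applied inside the gaps."""
    A = np.column_stack([X, np.ones(len(X))])  # add an intercept column
    coef, *_ = np.linalg.lstsq(A[~mask], y[~mask], rcond=None)
    out = y.copy()
    out[mask] = A[mask] @ coef
    return out

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic "complete" record; all coefficients are made up for illustration.
rng = np.random.default_rng(0)
n = 1000
t = np.arange(n, dtype=float)
choke = 0.5 + 0.3 * np.sin(2 * np.pi * t / 120)
p_down = 300 - 60 * choke + rng.normal(0, 0.5, n)   # downhole pressure
q_oil = 900 * choke + rng.normal(0, 5, n)           # oil rate
p_head = 0.5 * p_down - 40 * choke + rng.normal(0, 0.5, n)  # wellhead pressure

# Two missing sections covering 30% of the wellhead-pressure history.
mask = make_gap_mask(n, n_sections=2, missing_fraction=0.3)
p_head_obs = p_head.copy()
p_head_obs[mask] = np.nan

X = np.column_stack([choke, p_down, q_oil])
filled_interp = impute_interpolate(t, p_head_obs, mask)
filled_reg = impute_regression(X, p_head_obs, mask)

# Score only the rows that were removed, against the known true values.
rmse_interp = rmse(filled_interp[mask], p_head[mask])
rmse_reg = rmse(filled_reg[mask], p_head[mask])
print(f"interpolation RMSE: {rmse_interp:.2f}")
print(f"regression RMSE:    {rmse_reg:.2f}")
```

Changing `n_sections` and `missing_fraction` reproduces the single or multiple, short or long gap patterns in spirit. In this toy setup the regression imputer benefits from the other channels still being recorded inside the gap, which is the property the multivariate approach relies on; it would not help if all channels were lost at once.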