Application of Automated Machine Learning for Multi-Variate Prediction of Well Production

M. Maučec, S. Garni
DOI: 10.2118/195022-MS (https://doi.org/10.2118/195022-MS)
Published in: Day 3 Wed, March 20, 2019
Publication date: 2019-03-15
Citations: 14

Abstract

Performance evaluations of oil and gas assets are crucial for continuously improving operational efficiency in the mainstream petroleum industry. The success of such evaluations is largely driven by the analysis of the data accumulated during the asset's operational cycle. Usually, the amount of data stored in the databases far exceeds what can be analyzed with traditional spreadsheet-based tools or linear modeling. In this study we use data mining with multivariate predictive analytics and monetize the value of data by transforming the inferred information into knowledge and further into rigorous business decisions.

With the expansion of the Digital Oil Field and the transition to the 4th Industrial Revolution, the oil and gas industry is acquiring tremendous amounts of data that come from disparate sources and vary in origin, time scale, structure and quality. The underlying root-cause relationships between variables are highly non-linear and non-intuitive, and simplistic linear regression methods are suboptimal. We approach the challenge by developing a data-driven workflow that integrates components of artificial intelligence, machine learning and pattern recognition to enhance quantitative understanding of complex data.

The sanitized, aggregated data set combines 470 horizontal wells, with 15 numerical (e.g., stimulation interval length, production rates) and categorical (e.g., target zone, proppant type) predictors and the total produced barrels of oil equivalent (BOE) as the response variable. The objective is to predict the optimal set of variable values that maximizes production. We utilize an integrated analytics platform that enables a variety of sophisticated statistical operations on large-scale data: a) comprehensive data QA/QC for outliers, consistency and missing entries; b) Exploratory Data Analysis and visualization; c) feature selection, screening and ranking; d) building and training of multiple machine learning (ML) models for multi-variate regression (e.g., generalized linear model, deep learning, decision tree, random forest and gradient boosted machine); and e) response optimization of an identified "best-performing" ML model for highest prediction accuracy.

Our study introduces an initiative to establish concepts and best practices for predictive and prescriptive analytics in the domains of reservoir simulation, description and asset management. Given the unique volume and information richness of operational data, acquired over decades of production history, the anticipated applications of predictive analytics could expand to drilling optimization, smart data aggregation, well stimulation and equipment maintenance.
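
Steps d) and e) of the workflow (train several candidate regressors, keep the one with the lowest held-out error, then search the predictor space for the setting that maximizes the predicted response) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the two predictors (`length`, `proppant`) and the response function are hypothetical, and a k-nearest-neighbour regressor stands in for the non-linear model families named in the abstract. It is not the authors' platform, data or models.

```python
import math
import random

def make_wells(n, seed=0):
    """Synthetic stand-in for a well table; NOT the paper's data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        length = rng.uniform(1.0, 3.0)    # hypothetical stimulated length
        proppant = rng.uniform(0.5, 2.5)  # hypothetical proppant loading
        # Assumed non-linear response with an interior optimum in proppant.
        boe = 50.0 * length - 30.0 * (proppant - 1.5) ** 2 + rng.gauss(0.0, 1.0)
        rows.append(((length, proppant), boe))
    return rows

def fit_linear(train):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    X = [[1.0, *x] for x, _ in train]
    y = [t for _, t in train]
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting, then back substitution.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    w = [0.0] * k
    for c in reversed(range(k)):
        w[c] = (b[c] - sum(A[c][j] * w[j] for j in range(c + 1, k))) / A[c][c]
    return lambda x: w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def fit_knn(train, k=5):
    """k-nearest-neighbour regressor: mean response of the k closest wells."""
    def predict(x):
        nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
        return sum(t for _, t in nearest) / k
    return predict

def rmse(model, rows):
    """Root-mean-square error of a fitted model on (predictors, response) rows."""
    return math.sqrt(sum((model(x) - t) ** 2 for x, t in rows) / len(rows))

# d) Train candidate models and rank them on a held-out 20% of the wells.
rows = make_wells(470)
train, test = rows[:376], rows[376:]
candidates = {"linear": fit_linear(train), "knn": fit_knn(train)}
scores = {name: rmse(model, test) for name, model in candidates.items()}
best_name = min(scores, key=scores.get)
best_model = candidates[best_name]

# e) Response optimization: grid search for the setting with the highest
#    predicted response under the selected model.
grid = [(1.0 + 0.1 * i, 0.5 + 0.1 * j) for i in range(21) for j in range(21)]
best_setting = max(grid, key=best_model)

print("held-out RMSE per model:", scores)
print("selected model:", best_name)
print("predicted-optimal (length, proppant):", best_setting)
```

On this synthetic response the linear fit cannot represent the curvature in `proppant`, so the non-linear candidate wins on held-out error, which mirrors the abstract's point that simplistic linear regression is suboptimal when the underlying relationships are non-linear. A real study would also need the QA/QC, categorical encoding and feature-ranking steps, and would normally use cross-validation and not a single split.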