Machine Learning Approach to Classify Water Cut Measurements using DAS Fiber Optic Data

M. Alkhalaf, F. Hveding, Muhmmad Arsalan
DOI: 10.2118/197349-ms (https://doi.org/10.2118/197349-ms)
Source: Day 3 Wed, November 13, 2019
Publication date: 2019-11-11
Citations: 1

Abstract

Machine Learning Approach to Classify Water Cut Measurements using DAS Fiber Optic Data
Accurate flow metering is a crucial part of optimizing well production in both onshore and offshore environments. The industry currently relies on test separators and multiphase meters, which have limitations in terms of cost, transportation and safety. This paper discusses an alternative method that classifies water cut measurements in oil wells using Distributed Acoustic Sensing (DAS) data and machine learning. Fiber optics is an effective tool for downhole logging; the challenge usually lies in analyzing and processing the logged data. After a flowing survey was performed on an oil well, a dataset was developed by combining the logged DAS data with production logging tool (PLT) measurements. After extraction, processing and labeling of the raw DAS data, this dataset was used to train supervised machine learning models. Different classical machine learning models trained on this dataset are assessed in terms of accuracy, speed and training/testing segments. The PLT data show limited variation in water cut between the zones, ranging only from 71% to 76%. Because most points share a similar target value, this limits our ability to assess the validity of the model and raises the risk of overfitting. The same limitation is reflected in the Rayleigh backscatter collected by the laser box, where samples from different production zones share a similar value distribution across most frequency ranges. Three classification models were selected: a simple Decision Tree and two ensemble models, adaptive boosting (AdaBoost) and Random Forest. The ensemble models offer parallel and sequential training schemes that reduce the variance and the bias of the model. After splitting and shuffling the data, where 10% of the original data was used for training, all models were trained on different percentages of the training set.
Multiple metrics were chosen to assess model performance, including accuracy, F-score and confusion matrices. The Random Forest classifier appears to be the best choice for this challenge, with a maximum accuracy of 98% and an F-score of 0.99. The models depend heavily on low frequencies (below 500 Hz), where the variation in DAS values across production zones is comparatively higher. Both ensemble models are less biased, with a maximum feature weight of about 0.1; in contrast, the simple Decision Tree model was highly dependent on a single frequency response. In future work, a more complex and diverse dataset will be collected from wells covering a wider range of conditions and types. Moreover, once a more robust dataset exists, alternative approaches can be assessed, including classical machine learning models (both regression and classification) and deep learning.
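The evaluation metrics named in the abstract (confusion matrix, accuracy and F-score) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the class names `wc_low`, `wc_mid` and `wc_high` and the label lists are hypothetical placeholders, and the per-class F-score shown here is only one common way to define the metric.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix whose rows are true classes and columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def f_scores(matrix, beta=1.0):
    """Per-class F-beta score computed from a confusion matrix."""
    n = len(matrix)
    b2 = beta ** 2
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(matrix[k]) - tp                       # truly k, predicted another class
        denom = (1 + b2) * tp + b2 * fn + fp
        scores.append((1 + b2) * tp / denom if denom else 0.0)
    return scores


# Hypothetical water cut classes and predictions, for illustration only.
labels = ["wc_low", "wc_mid", "wc_high"]
y_true = ["wc_low", "wc_low", "wc_mid", "wc_mid", "wc_high", "wc_high"]
y_pred = ["wc_low", "wc_mid", "wc_mid", "wc_mid", "wc_high", "wc_low"]

cm = confusion_matrix(y_true, y_pred, labels)
print("confusion matrix:", cm)
print("accuracy:", accuracy(cm))
print("per-class F1:", f_scores(cm))
```

Reading accuracy alongside the per-class F-scores matters when the target values are tightly clustered, as the abstract notes for the 71% to 76% water cut range: a single accuracy figure can hide a class that is rarely predicted correctly.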