Machine Learning Approach to Classify Water Cut Measurements using DAS Fiber Optic Data
Authors: M. Alkhalaf, F. Hveding, Muhmmad Arsalan
Published in: Day 3 Wed, November 13, 2019 (publication date 2019-11-11)
DOI: 10.2118/197349-ms (https://doi.org/10.2118/197349-ms)
Citations: 1
Abstract
Accurate flow metering is a crucial part of optimizing well production in both onshore and offshore environments. The industry currently relies on test separators and multiphase meters, which have limitations in terms of cost, transportation and safety. This paper discusses an alternative method to classify water cut measurements in oil wells based on Distributed Acoustic Sensing (DAS) data and machine learning. Fiber optics is an effective tool for downhole logging; however, the challenge usually resides in the analysis and processing of the logged data. After a flowing survey was performed on an oil well, a dataset was developed using the logged DAS data in combination with production logging tool (PLT) measurements. After the raw DAS data is extracted, processed and labeled, this dataset is used to train supervised machine learning models.
In this paper, several classical machine learning models trained on this dataset are assessed in terms of accuracy, speed and training/testing segments. The data gathered from the PLT shows limited variation in water cut between the zones, ranging only from 71% to 76%. This limits our ability to assess the validity of the model and raises the risk of overfitting, since most points share a similar target value. The same limitation is reflected in the Rayleigh backscatter collected by the laser box, where samples from different production zones share a similar value distribution across most frequency ranges. Three classification models were selected: a simple Decision Tree and two ensemble methods, adaptive boosting and Random Forest. The ensemble methods offer parallel (Random Forest) and sequential (adaptive boosting) training schemes that reduce the variance and the bias of the model. After splitting and shuffling the data, where 10% of the original data was used for training, all models were trained on different percentages of the training set. Multiple metrics were chosen to assess model performance, including accuracy, F-score and confusion matrices. The Random Forest classifier appears to be the best choice for this challenge, with a maximum accuracy of 98% and an F-score of 0.99. The models show a high dependency on low frequencies (below 500 Hz), where the difference in value distribution across production zones in the DAS measurements is comparatively higher. Both ensemble models are less biased toward any single feature, with a maximum feature weight of about 0.1; in contrast, the simple Decision Tree model was highly dependent on a single frequency response. In future work, a more complex and diverse dataset will be collected from wells with a wider range of conditions and types. Moreover, once a more robust dataset has been created, alternative approaches can be assessed, including both classical machine learning models (regression and classification) and deep learning.
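The comparison described above can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: the data is synthetic, the function names `make_synthetic_das` and `evaluate_models` are hypothetical, and the 70/30 split, default hyperparameters and macro-averaged F-score are all assumptions made for illustration. It only shows how the three classifier families, the metrics and the per-feature weights mentioned in the abstract fit together.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_das(n_samples=600, n_bands=40, n_zones=3, seed=0):
    """Synthetic stand-in for labeled DAS spectra (NOT the paper's data).

    Each row is one sample of per-band energy; the label is a hypothetical
    production-zone index. Only the first five "low-frequency" bands differ
    between zones, mimicking the low-frequency dependence in the abstract.
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, n_zones, size=n_samples)
    X = rng.normal(size=(n_samples, n_bands))
    X[:, :5] += 1.5 * y[:, None]  # zone-dependent shift in the low bands only
    return X, y


def evaluate_models(X, y, seed=0):
    """Train the three classifier families and collect test-set metrics."""
    # Shuffled, stratified split; the 70/30 ratio is an assumption.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, shuffle=True, stratify=y, random_state=seed
    )
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "adaboost": AdaBoostClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "f1_macro": f1_score(y_test, y_pred, average="macro"),
            "confusion": confusion_matrix(y_test, y_pred),
            # Per-feature weights, comparable to the "feature weight" above.
            "importances": model.feature_importances_,
        }
    return results


if __name__ == "__main__":
    X, y = make_synthetic_das()
    for name, r in evaluate_models(X, y).items():
        top = int(np.argmax(r["importances"]))
        print(f"{name}: acc={r['accuracy']:.3f} f1={r['f1_macro']:.3f} "
              f"top_band={top} max_weight={r['importances'].max():.3f}")
```

Comparing `importances.max()` across the three models is one way to see the contrast the abstract draws: a single tree tends to concentrate weight on one feature, while the ensembles spread it across several.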