The Impact of Data Splitting Strategy on Drilling Rate Prediction in the Rumaila Oil Field

Impact factor 1.3; Zone 4, Engineering and Technology; JCR Q3, Chemistry, Organic
Ameen Kareem Salih, Ali Khaleel Faraj, Mohammed A. Ahmed, Ali Nahi Abed Al-Hasnawi
Petroleum Chemistry, vol. 64, no. 7, pp. 781-786. Journal Article. Published 2024-09-26. DOI: 10.1134/S0965544124050025
https://link.springer.com/article/10.1134/S0965544124050025
Citations: 0

Abstract

Supervised machine learning is an important tool that has helped solve many problems, especially problems that humans cannot solve unaided. Building an accurate model depends on several factors: the collected data, the choice of model, the data splitting method used to train and evaluate the model, and the choice of hyperparameters. Data splitting is among the most important of these for obtaining a high-accuracy model and for avoiding overfitting, in which a model achieves high training accuracy but fails in testing and prediction. This paper investigates the impact of different data splitting strategies (hold-out with different test sizes, K-Fold, and shuffle split) on the effectiveness of a supervised machine learning model for predicting drilling rate in the Rumaila oil field in southern Iraq, and selects the optimal data splitting strategy. The highest testing accuracy obtained was 0.827, achieved with the shuffle split strategy.
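
The three splitting strategies named in the abstract differ only in how sample indices are assigned to training and test sets. The sketch below illustrates that difference using the Python standard library alone. It is not the authors' code: the function names, the seed, and the sample counts are assumptions chosen for illustration, and the abstract does not state which model or accuracy metric was used. In practice, scikit-learn's `train_test_split`, `KFold`, and `ShuffleSplit` provide the same strategies.

```python
import random


def hold_out(n_samples, test_size, seed=0):
    """Single shuffled split; returns (train_idx, test_idx)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n_samples * test_size))
    return idx[n_test:], idx[:n_test]


def k_fold(n_samples, k):
    """K-Fold without shuffling: every sample is tested exactly once."""
    idx = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = idx[start:start + size]
        train = idx[:start] + idx[start + size:]
        yield train, test
        start += size


def shuffle_split(n_samples, n_splits, test_size, seed=0):
    """Repeated random hold-out: test sets may overlap across repetitions."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_size))
    for _ in range(n_splits):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]


if __name__ == "__main__":
    n = 10  # hypothetical number of drilling records
    train, test = hold_out(n, test_size=0.2)
    print("hold-out test indices:", sorted(test))
    for fold, (train, test) in enumerate(k_fold(n, k=3)):
        print(f"k-fold {fold} test indices:", test)
    for rep, (train, test) in enumerate(shuffle_split(n, n_splits=3, test_size=0.2)):
        print(f"shuffle split {rep} test indices:", sorted(test))
```

With K-Fold, the test sets partition the data, so each record contributes to exactly one test score. With shuffle split, each repetition draws a fresh random test set, so the number of repetitions and the test size can be chosen independently, at the cost of some records being tested more than once or not at all.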

Source journal: Petroleum Chemistry (Engineering and Technology, Chemical Engineering)
CiteScore: 2.50
Self-citation rate: 21.40%
Articles published: 102
Review time: 6-12 weeks
Journal description: Petroleum Chemistry (Neftekhimiya), founded in 1961, offers original papers on and reviews of theoretical and experimental studies concerned with current problems of petroleum chemistry and processing such as chemical composition of crude oils and natural gas liquids; petroleum refining (cracking, hydrocracking, and catalytic reforming); catalysts for petrochemical processes (hydrogenation, isomerization, oxidation, hydroformylation, etc.); activation and catalytic transformation of hydrocarbons and other components of petroleum, natural gas, and other complex organic mixtures; new petrochemicals including lubricants and additives; environmental problems; and information on scientific meetings relevant to these areas. Petroleum Chemistry publishes articles on these topics from members of the scientific community of the former Soviet Union.