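Optimized feature selection assists lithofacies machine learning with sparse well log data combined with calculated attributes in a gradational fluvial sequence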

Artificial Intelligence in Geosciences, Volume 3, Pages 132-147. Pub Date: 2022-12-01. DOI: 10.1016/j.aiig.2022.11.003

David A. Wood


Citations: 2


Optimized feature selection assists lithofacies machine learning with sparse well log data combined with calculated attributes in a gradational fluvial sequence

Using machine learning (ML) to predict lithofacies from sparse suites of well-log data is difficult in laterally and vertically heterogeneous reservoir formations in oil and gas fields. Meandering, braided fluviatile depositional environments tend to form clastic sequences with laterally discontinuous layers due to the continuous shifting of relatively narrow sandstone channels. Three cored wellbores drilled through such a reservoir in a large oil field, with just four recorded well logs available, are used to classify four lithofacies using ML models. To augment the well-log data, six derivative and volatility attributes were calculated from the recorded gamma ray and density logs, providing sixteen log features for the ML models to select from. A novel, multiple-optimizer feature selection technique was developed to identify high-performing feature combinations, with which seven ML models were used to predict lithofacies, assisted by multi-k-fold cross validation. Feature combinations with just seven to nine selected log features achieved an overall ML lithofacies accuracy of 0.87 for the two wells used for training and validation. When the trained ML models were applied to a third well for testing, lithofacies prediction accuracy declined to 0.65 for the best-performing extreme gradient boosting model with seven features. However, that model achieved an accuracy of ∼0.76 in predicting the presence of the pay-bearing sandstone and siltstone lithofacies in the test well. A model using only the four recorded well logs was able to predict the pay-bearing lithofacies with only ∼0.6 accuracy. Annotated confusion matrices and feature importance analysis provide additional insight into ML model performance and identify the log attributes that are most influential in enhancing lithofacies prediction.
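The abstract does not define how the derivative and volatility attributes are calculated, so the following is only a minimal sketch of one plausible reading: a sample-to-sample difference of each log curve and a trailing-window standard deviation of that difference. The function names, the window length, the curve mnemonics `GR` and `RHOB`, and the sample values are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import pstdev


def first_difference(values):
    """Sample-to-sample change of a log curve; the first sample gets 0.0."""
    return [0.0] + [b - a for a, b in zip(values, values[1:])]


def rolling_volatility(values, window=3):
    """Population standard deviation over a trailing window of samples.

    Windows are truncated at the top of the log, so the output has the
    same length as the input.
    """
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(pstdev(values[start:i + 1]))
    return out


def augment(gamma_ray, density, window=3):
    """Return the two recorded curves plus one derivative and one
    volatility attribute for each (hypothetical attribute definitions)."""
    features = {"GR": list(gamma_ray), "RHOB": list(density)}
    for name in ("GR", "RHOB"):
        diff = first_difference(features[name])
        features[name + "_diff"] = diff
        features[name + "_vol"] = rolling_volatility(diff, window)
    return features


# Hypothetical depth-ordered samples, not data from the study.
gr = [40.0, 45.0, 70.0, 95.0, 90.0]
rhob = [2.30, 2.32, 2.45, 2.55, 2.54]
table = augment(gr, rhob)
```

Each derived column has the same length as the recorded curve, so the resulting table could be passed to a feature-selection step alongside the original logs. Repeating the calculation with several window lengths would be one way to obtain multiple attributes per log, though whether the paper does this is not stated in the abstract.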


Source journal

Artificial Intelligence in Geosciences

CiteScore

4.20

Self-citation rate

0.00%

Articles published