Input Data Quality Influence On Lithoclass Predictions In Relation To Supervised Machine Learning
H. W. Bøe, K. B. Brandsegg, L. Marello, A. Črne
First EAGE/PESGB Workshop Machine Learning, 2018-11-30
https://doi.org/10.3997/2214-4609.201803032
We assess the importance of data availability and consistency prior to applying supervised machine learning to predict lithoclasses from wireline logs. A dataset is pre-processed and used as training data by three machine learning models in order to investigate the sensitivity of the lithoclass predictions. The first model uses the quality-assured dataset without any modifications. The second model standardizes log signatures, whereas the third model uses the dataset in combination with additional features that dampen extreme outliers. The three models are evaluated against lithofacies interpretations based on CPIs to show how their predictive power varies. The method is applied to a quality-controlled Jurassic interval dataset of ~100 exploration wells within a quadrant of the Norwegian part of the North Sea. The results show that the number of wireline logs available has a direct influence on the prediction accuracy. For acceptable prediction accuracy, the wells should contain at least the gamma ray, density and neutron logs. To distinguish between water-bearing and hydrocarbon-bearing intervals in sandstones, the resistivity logs should also be present. When implementing machine learning on a regional scale, we should consider varying burial depth and depositional environment in order to gain optimal predictive power.
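
The abstract does not say how the second and third models transform the logs. As a minimal sketch, assuming per-well z-scoring for the standardization and median/MAD clipping for the outlier dampening (both are illustrative choices, not necessarily the authors' method), the three input variants could be built as follows. The well names, the helper functions and the gamma-ray values are all hypothetical.

```python
from statistics import mean, median, pstdev

# Hypothetical gamma-ray readings (API units) for two wells; the values are made up.
wells = {
    "well_A": [40.0, 55.0, 60.0, 120.0, 45.0],
    "well_B": [70.0, 85.0, 90.0, 400.0, 75.0],  # 400.0 is a deliberate spike
}


def standardize(values):
    """Per-well z-score: subtract the well mean, divide by the well std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def dampen_outliers(values, k=3.0):
    """Clip values to median +/- k * MAD (assumed dampening rule)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    lo, hi = med - k * mad, med + k * mad
    return [min(max(v, lo), hi) for v in values]


def build_variants(values):
    """Return the inputs for the three model variants, one row per depth sample."""
    raw = [[v] for v in values]                      # model 1: unmodified log
    scaled = [[z] for z in standardize(values)]      # model 2: standardized log
    damped = dampen_outliers(values)
    extra = [[v, d] for v, d in zip(values, damped)] # model 3: log + dampened feature
    return raw, scaled, extra


variants = {name: build_variants(vals) for name, vals in wells.items()}
for name, (raw, scaled, extra) in variants.items():
    print(name, extra)
```

Standardizing per well rather than over the whole dataset is one way to reduce well-to-well shifts in log response, for example from differing burial depth; whether that matches the study's approach is an assumption.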