Assessment of automated stratigraphic interpretations of boreholes with geology-informed metrics

IF 4.4 | Zone 2, Earth Sciences | Q1, Computer Science, Interdisciplinary Applications

Computers & Geosciences Pub Date : 2025-09-10 DOI:10.1016/j.cageo.2025.106043

Sebastián Garzón, Willem Dabekaussen, Freek S. Busschers, Eva De Boever, Siamak Mehrkanoon, Derek Karssenberg

Computers & Geosciences, Volume 207, Article 106043. https://www.sciencedirect.com/science/article/pii/S0098300425001931

Citations: 0

Abstract

Stratigraphic interpretation of borehole data is a fundamental aspect of subsurface geological models, providing critical insights into the distribution of stratigraphic units. However, expert interpretation of all available borehole data is impractical for large-scale regional mapping involving thousands of boreholes. Automated interpretations using machine learning models can significantly increase the number of boreholes included in subsurface geological models. Nevertheless, these predictions must adhere to strict spatial and stratigraphic relationships (e.g. superposition) to ensure geological plausibility, which often requires post-processing tasks. Traditional evaluation metrics commonly used for general-domain classification tasks (e.g. accuracy, F1-score) do not necessarily reflect the geological plausibility of predictions, as they fail to account for the sequential nature and spatial relationships inherent in borehole interpretation. To address this limitation, we propose and evaluate a set of geology-informed metrics that focus on three key aspects of stratigraphic interpretation, namely the expected geographical extent of units (extent metrics), their sequential relationships (sequence metrics), and their vertical positioning along boreholes (position metrics). Using a dataset of 1394 boreholes from the Cenozoic Roer Valley Graben (southeast Netherlands), which covers ∼3000 km² and includes 15 lithostratigraphic units, we demonstrate that Random Forest and Neural Network models with similar performance on traditional metrics (e.g. accuracy, Cohen’s kappa, and F1-score) can differ significantly in their ability to produce geologically plausible predictions. For example, while many model configurations achieve ∼75%–80% agreement between expected and predicted classes, the Neural Network models better capture the sequential stratigraphic relationships expected in the study area. Our results underscore the need for domain-specific metrics that offer a more accurate and interpretable assessment of model performance.
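The gap between classification accuracy and geological plausibility can be illustrated with a toy example. The sketch below is not the paper's metric definition, which the abstract does not give. It assumes a hypothetical borehole split into equal depth intervals listed top to bottom, three made-up units `A` (youngest) to `C` (oldest), and a simple superposition check that counts transitions where a younger unit appears below an older one. The function names `accuracy` and `order_violations` are illustrative only.

```python
from typing import Sequence


def accuracy(expected: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of depth intervals whose predicted unit matches the expected unit."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    return sum(e == p for e, p in zip(expected, predicted)) / len(expected)


def order_violations(predicted: Sequence[str], order: Sequence[str]) -> int:
    """Count downward transitions that step back to a younger unit.

    `order` lists units from youngest to oldest (an assumed, simplified
    stratigraphic column). Skipping a unit is allowed, since a unit may be
    absent at a given location.
    """
    rank = {unit: i for i, unit in enumerate(order)}  # 0 = youngest
    violations = 0
    for upper, lower in zip(predicted, predicted[1:]):
        if rank[lower] < rank[upper]:  # younger unit found below an older one
            violations += 1
    return violations


# Hypothetical data: intervals listed from the top of the borehole downward.
order = ["A", "B", "C"]
expected = ["A", "A", "B", "B", "C", "C", "C", "C"]
pred_1 = ["A", "A", "B", "B", "B", "B", "C", "C"]  # one boundary placed too deep
pred_2 = ["A", "A", "B", "A", "C", "B", "C", "C"]  # two scattered errors

for name, pred in [("pred_1", pred_1), ("pred_2", pred_2)]:
    print(name, accuracy(expected, pred), order_violations(pred, order))
```

Both hypothetical predictions agree with the expected units in 6 of 8 intervals, yet only the first respects the assumed order. An interval-wise score cannot separate the two, whereas a sequence-aware check can.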


Source journal

Computers & Geosciences (Earth Sciences: Geosciences, Multidisciplinary)

CiteScore: 9.30

Self-citation rate: 6.80%

Articles published: 164

Review time: 3.4 months

Journal description: Computers & Geosciences publishes high-impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.