Assessment of automated stratigraphic interpretations of boreholes with geology-informed metrics
Sebastián Garzón, Willem Dabekaussen, Freek S. Busschers, Eva De Boever, Siamak Mehrkanoon, Derek Karssenberg
Computers & Geosciences, volume 207, Article 106043. Published 2025-09-10.
DOI: 10.1016/j.cageo.2025.106043
URL: https://www.sciencedirect.com/science/article/pii/S0098300425001931
Abstract
Stratigraphic interpretation of borehole data is a fundamental aspect of subsurface geological models, providing critical insights into the distribution of stratigraphic units. However, expert interpretation of all available borehole data is impractical for large-scale regional mapping involving thousands of boreholes. Automated interpretations using machine learning models can significantly increase the number of boreholes included in subsurface geological models. Nevertheless, these predictions must adhere to strict spatial and stratigraphic relationships (e.g. superposition) to ensure geological plausibility, which often requires post-processing tasks. Traditional evaluation metrics commonly used for general-domain classification tasks (e.g. accuracy, F1-score) do not necessarily reflect the geological plausibility of predictions, as they fail to account for the sequential nature and spatial relationships inherent in borehole interpretation. To address this limitation, we propose and evaluate a set of geology-informed metrics that focus on three key aspects of stratigraphic interpretation, namely the expected geographical extent of units (extent metrics), their sequential relationships (sequence metrics), and their vertical positioning along boreholes (position metrics). Using a dataset of 1394 boreholes from the Cenozoic Roer Valley Graben (southeast Netherlands), which covers ~3000 km² and includes 15 lithostratigraphic units, we demonstrate that Random Forest and Neural Network models with similar performance on traditional metrics (e.g. accuracy, Cohen’s kappa, and F1-score) can differ significantly in their ability to produce geologically plausible predictions. For example, while many model configurations achieve ~75%–80% agreement between expected and predicted classes, the Neural Network models better capture the sequential stratigraphic relationships expected in the study area. Our results underscore the need for domain-specific metrics that offer a more accurate and interpretable assessment of model performance.
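The contrast between traditional and sequence-aware evaluation can be illustrated with a toy example. The sketch below is not the set of metrics defined in the paper; it assumes a hypothetical borehole whose units are numbered in stratigraphic order (0 = youngest, larger = older) and a made-up check, `order_violations`, that counts places where a younger unit is predicted beneath an older one. Under those assumptions, two predictions with the same accuracy can differ in whether they respect superposition.

```python
# Hypothetical illustration only; these are NOT the metrics defined in the paper.
# Assumption: units are integers in stratigraphic order, 0 = youngest (top),
# larger numbers = older (deeper), and intervals are listed from top to bottom.

def accuracy(expected, predicted):
    """Fraction of depth intervals whose predicted unit matches the expected unit."""
    matches = sum(e == p for e, p in zip(expected, predicted))
    return matches / len(expected)


def order_violations(predicted):
    """Count downward transitions where a younger unit lies beneath an older one."""
    return sum(1 for upper, lower in zip(predicted, predicted[1:]) if lower < upper)


# One toy borehole with eight depth intervals.
expected = [0, 0, 1, 1, 2, 2, 3, 3]
pred_a = [0, 0, 1, 1, 1, 2, 2, 3]  # two misplaced boundaries, order preserved
pred_b = [0, 1, 0, 1, 2, 2, 3, 3]  # two errors, unit 0 reappears below unit 1

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, accuracy(expected, pred), order_violations(pred))
```

Both predictions match the expected units in six of eight intervals, yet only prediction B places a younger unit below an older one. A real sequence metric would also need to handle absent units, faulting, and units without a strict age order, which this sketch ignores.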
About the journal:
Computers & Geosciences publishes high impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.