Facies classification using k-means clustering algorithm in Mara Field, Niger Delta, Nigeria
Authors: Esther Kerubo, Moruffdeen Adedapo Adabanija, Olatunbosun Adedayo Alao
Journal: Arabian Journal of Geosciences, vol. 18, no. 9 (Impact factor: 1.827; JCR: Q2, Earth and Planetary Sciences)
Publication date: 2025-08-19
DOI: 10.1007/s12517-025-12311-4
URL: https://link.springer.com/article/10.1007/s12517-025-12311-4
Citations: 0
Abstract
An integrated k-means clustering analysis of well log data from Mara field, Niger Delta, Nigeria, has been carried out with a view to segmenting the well data into different facies based on their physical and geological properties. The relationship between the cluster labels and the facies types was studied using cross-plots, histograms, and statistical analysis, and the cluster predictions were compared with conventional well log interpretation. Three well datasets from Mara field (Mara-1, Mara-2, and Mara-3) containing gamma ray, neutron porosity, density, and deep resistivity logs were used. The well data were subjected to preprocessing, exploratory data analysis, outlier detection and removal, feature selection, and scaling to make them more suitable for machine learning (ML) methods. The Mara-3 well was excluded from clustering because its density and neutron porosity logs had missing data, which may have resulted from tool failures, depth misalignments, or manual removal of bad data, and which could significantly affect petrophysical analyses and ML model performance. The k-means clustering algorithm was implemented using the Scikit-learn library. The elbow method and the silhouette score were then used to evaluate the number of clusters. The elbow method suggested approximately 3 clusters, whereas the silhouette score reached its highest value, 0.552, at 2 clusters. Two clusters were therefore selected as the best choice, with the score denoting that the data points are very compact within the clusters to which they belong. Based on the clustering results, distinct facies (shale and sandstone) were recognized successfully. A reservoir unit of sandstone and shale intercalations was delineated from the two wells, and a dynamic depositional environment was inferred. Comparison of the identified facies units with the conventional method of interpretation showed that the k-means algorithm was able to cluster the data and correlate them with depth.
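The model-selection step described in the abstract (scaling, k-means with Scikit-learn, then the elbow method and silhouette score) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data are synthetic stand-ins rather than the Mara field logs, and the log values, the helper name `evaluate_k`, the tested range of k, and the random seeds are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for well logs (NOT the Mara field data): two groups
# loosely mimicking shale-like and sand-like log responses.
rng = np.random.default_rng(0)
n = 300
shale_like = np.column_stack([
    rng.normal(110.0, 10.0, n),  # gamma ray (API units)
    rng.normal(0.35, 0.03, n),   # neutron porosity (fraction)
    rng.normal(2.55, 0.05, n),   # bulk density (g/cm3)
    rng.normal(0.3, 0.15, n),    # log10 of deep resistivity (ohm.m)
])
sand_like = np.column_stack([
    rng.normal(40.0, 10.0, n),
    rng.normal(0.22, 0.03, n),
    rng.normal(2.25, 0.05, n),
    rng.normal(1.2, 0.3, n),
])
logs = np.vstack([shale_like, sand_like])

# Scale each log to zero mean and unit variance so that no single
# curve dominates the Euclidean distances used by k-means.
X = StandardScaler().fit_transform(logs)


def evaluate_k(X, k_values, seed=0):
    """Return inertia (for an elbow plot) and silhouette score for each k."""
    inertia, silhouette = {}, {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        inertia[k] = model.inertia_
        silhouette[k] = silhouette_score(X, model.labels_)
    return inertia, silhouette


inertia, silhouette = evaluate_k(X, range(2, 7))

# Choose the k with the highest mean silhouette score.
best_k = max(silhouette, key=silhouette.get)
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)

for k in sorted(inertia):
    print(f"k={k}: inertia={inertia[k]:.1f}, silhouette={silhouette[k]:.3f}")
print("selected k:", best_k)
```

Plotting `inertia` against k gives the elbow curve, while `silhouette` gives a single number per k that can be compared directly. The cluster labels themselves carry no geological meaning; assigning them to shale or sandstone still requires inspecting the log responses in each cluster, for example with the cross-plots and histograms mentioned in the abstract.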
About the journal:
The Arabian Journal of Geosciences is the official journal of the Saudi Society for Geosciences and publishes peer-reviewed original and review articles on the entire range of Earth Science themes, focused on, but not limited to, those that have regional significance to the Middle East and the Euro-Mediterranean Zone.
Key topics therefore include geology, hydrogeology, earth system science, petroleum sciences, geophysics, seismology and crustal structures, tectonics, sedimentology, palaeontology, metamorphic and igneous petrology, natural hazards, environmental sciences and sustainable development, geoarchaeology, geomorphology, paleo-environment studies, oceanography, atmospheric sciences, GIS and remote sensing, geodesy, mineralogy, volcanology, geochemistry, and metallogenesis.