Hang Liu, Sandong Zhou, Xinyu Liu, Qiaoyun Cheng, Weixin Zhang, Detian Yan, Hua Wang
An Interpretable Stacking Ensemble Model for Predicting Free Hydrocarbons Content in Shale
Natural Resources Research, published 2025-09-03
DOI: https://doi.org/10.1007/s11053-025-10553-3
Citations: 0
Abstract
Free hydrocarbons are among the fundamental indicators of shale organic matter richness and hydrocarbon generation potential. The traditional experimental method, based on rock pyrolysis, is time-consuming and expensive. This study aimed to predict free hydrocarbons content in the Qingshankou Formation shale of the Changling Depression in the Songliao Basin. Using 521 sets of logging data as input, a stacking ensemble model for predicting shale free hydrocarbons content was developed from six base learners: decision tree (DT), random forest (RF), gradient boosting decision tree (GBDT), support vector machine (SVM), K-nearest neighbors (KNN), and artificial neural network (ANN), combined with a linear regression meta-model. Model performance was analyzed and ranked using three evaluation metrics: coefficient of determination, root mean square error, and mean absolute error. The results indicated that the performance ranking from high to low was Stacking, RF, SVM, KNN, GBDT, ANN, and DT. The best-performing stacking ensemble model was successfully applied to predict the free hydrocarbons curve on the connected well profile. Shapley additive explanations were used to explain the stacking ensemble model, and the results indicated that, among the logging curves, the gamma ray log contributed the most to the prediction of shale free hydrocarbons content. This study provides experience in model interpretation for predicting free hydrocarbons, helping to evaluate source rocks and select the “sweet spot” for shale oil.
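The three evaluation metrics named in the abstract can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the measured values, the two model names, and their predictions are hypothetical, and ranking by root mean square error alone is an assumption, since the abstract does not say how the three metrics were combined into one ranking.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical measured free hydrocarbons values and model predictions
# (illustrative numbers only, not data from the study).
measured = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "model_a": [1.1, 1.9, 3.2, 3.8],
    "model_b": [1.5, 2.5, 2.5, 4.5],
}

# Score every model with all three metrics.
scores = {
    name: {
        "r2": r2_score(measured, pred),
        "rmse": rmse(measured, pred),
        "mae": mae(measured, pred),
    }
    for name, pred in predictions.items()
}

# Assumed ranking rule: lower RMSE ranks higher.
ranking = sorted(scores, key=lambda name: scores[name]["rmse"])

for name in ranking:
    s = scores[name]
    print(f"{name}: R2={s['r2']:.3f} RMSE={s['rmse']:.3f} MAE={s['mae']:.3f}")
```

A higher coefficient of determination and lower error values indicate a better fit, so in this toy case the three metrics agree on the order; with real models they can disagree, which is why the ranking rule needs to be stated explicitly.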
About the journal:
This journal publishes quantitative studies of natural (mainly but not limited to mineral) resources exploration, evaluation and exploitation, including environmental and risk-related aspects. Typical articles use geoscientific data or analyses to assess, test, or compare resource-related aspects. NRR covers a wide variety of resources including minerals, coal, hydrocarbon, geothermal, water, and vegetation. Case studies are welcome.