An Interpretable Stacking Ensemble Model for Predicting Free Hydrocarbons Content in Shale

IF 5.0; Zone 2 (Earth Sciences); JCR Q1 (GEOSCIENCES, MULTIDISCIPLINARY)

Natural Resources Research, Pub Date: 2025-09-03, DOI: 10.1007/s11053-025-10553-3

Hang Liu, Sandong Zhou, Xinyu Liu, Qiaoyun Cheng, Weixin Zhang, Detian Yan, Hua Wang

Citations: 0

Abstract

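Free hydrocarbons are among the fundamental indicators of shale organic matter richness and hydrocarbon generation potential. The traditional experimental analysis method, based on rock pyrolysis, is time-consuming and expensive. This study aimed to predict free hydrocarbons in the Qingshankou Formation shale of the Changling Depression in the Songliao Basin. Using 521 sets of logging data as input, a stacking ensemble model for predicting shale free hydrocarbons content was developed from six base learners, namely decision tree (DT), random forest (RF), gradient boosting decision tree (GBDT), support vector machine (SVM), K-nearest neighbors (KNN), and artificial neural network (ANN), combined with a linear regression meta-model. Model performance was analyzed and ranked using three evaluation metrics: coefficient of determination, root mean square error, and mean absolute error. The resulting ranking, from best to worst, was Stacking, RF, SVM, KNN, GBDT, ANN, and DT. The best-performing stacking ensemble model was then applied to predict the free hydrocarbons curve along a connected-well profile. Shapley additive explanations (SHAP) were used to explain the stacking ensemble model, and the results indicated that the gamma ray log contributed the most to the prediction of shale free hydrocarbons content. This study offers experience in model interpretation for free hydrocarbons prediction, helping to evaluate source rocks and select "sweet spots" for shale oil.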
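
The architecture named in the abstract (six base learners feeding a linear regression meta-model, scored by coefficient of determination, root mean square error, and mean absolute error) can be sketched with scikit-learn. This is an illustrative sketch only, not the authors' code: the data are synthetic stand-ins for the 521 logging samples, and the feature count, train/test split, cross-validation setting, and every hyperparameter are assumptions made here for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic placeholder data (assumption): 521 samples, 5 "log" features.
X, y = make_regression(n_samples=521, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The six base learners listed in the abstract; hyperparameters are assumed.
# Scale-sensitive learners (SVM, KNN, ANN) are wrapped with a scaler.
base_learners = [
    ("dt", DecisionTreeRegressor(max_depth=5, random_state=0)),
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("gbdt", GradientBoostingRegressor(n_estimators=50, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVR(C=10.0))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))),
    ("ann", make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(32,), solver="lbfgs",
                     max_iter=1000, random_state=0),
    )),
]

# Meta-model: linear regression fitted on out-of-fold base predictions.
stack = StackingRegressor(
    estimators=base_learners, final_estimator=LinearRegression(), cv=5
)


def evaluate(model):
    """Fit on the training split and return the three metrics on the test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return {
        "r2": r2_score(y_test, pred),
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "mae": mean_absolute_error(y_test, pred),
    }


scores = {name: evaluate(model) for name, model in base_learners}
scores["stacking"] = evaluate(stack)

# Rank models from best to worst by coefficient of determination.
ranking = sorted(scores, key=lambda name: scores[name]["r2"], reverse=True)
for name in ranking:
    s = scores[name]
    print(f"{name:9s} R2={s['r2']:.3f} RMSE={s['rmse']:.2f} MAE={s['mae']:.2f}")
```

On synthetic data the ranking need not match the one reported in the abstract; the sketch shows only how such a comparison could be set up.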

Source journal

Natural Resources Research (Environmental Science: General Environmental Science)

CiteScore: 11.90

Self-citation rate: 11.10%

Articles published: 151

About the journal: This journal publishes quantitative studies of natural (mainly but not limited to mineral) resources exploration, evaluation and exploitation, including environmental and risk-related aspects. Typical articles use geoscientific data or analyses to assess, test, or compare resource-related aspects. NRR covers a wide variety of resources including minerals, coal, hydrocarbon, geothermal, water, and vegetation. Case studies are welcome.