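A novel class-imbalance learning framework for fluid recognition: Application to the Qingshimao-Gaoshawo tight-sand gas reservoirs in the Ordos Basin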

IF 4.4 | Zone 2, Earth Sciences | JCR Q1: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Computers & Geosciences Pub Date : 2025-07-14 DOI:10.1016/j.cageo.2025.105993

Jun Yi , ZhongLi Qi , XiangChengZhen Li , Fuqiang Lai , Wei Zhou

Computers & Geosciences, Volume 205, Article 105993. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0098300425001438

Citations: 0

Abstract

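Methods based on mathematical models for conventional oil and gas resources often perform poorly at fluid identification in tight-sandstone reservoirs, because factors such as reservoir lithology and pore structure interfere with one another. The rapid development of artificial intelligence and the accumulation of well logging data provide a solid foundation for applying machine learning methods as new tools for fluid identification. However, the category proportions in collected logging data are often severely imbalanced, which makes ideal classification results hard to obtain. This problem has therefore become a major challenge for both academia and industry. To address it, a novel class-imbalance learning framework for fluid recognition (CILF) is proposed and applied to the Qingshimao-Gaoshawo tight-sandstone gas reservoirs of the Ordos Basin. At the data level, an improved label propagation algorithm based on semi-supervised learning (SS-LPA) is designed; by assigning high-confidence labels to unlabeled samples, it reduces the imbalance rate of the raw data to some extent. At the model level, the Q-network, an effective reinforcement learning method, is introduced into an ensemble learning framework (QNEL), which trains multiple baseline models and assigns them different weights based on feedback on the imbalanced data, improving the multi-class accuracy of fluid identification. Experimental results from 35 tight-sandstone wells in the Qingshimao-Gaoshawo area of the Ordos Basin verify the effectiveness of the framework. Specifically, CILF performs best on all three typical evaluation metrics and outperforms the other methods in 12 of the 18 categories. Averaged over the six categories, the precision, recall, and F1 score of the proposed framework reach 0.988, 0.984, and 0.985, respectively.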
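The averages quoted at the end of the abstract can be read as per-category precision, recall, and F1 scores averaged over the categories. The following is a minimal sketch of that computation, assuming an unweighted (macro) average, which the abstract does not state explicitly. The class names `gas` and `water` and the toy predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[c] = (prec, rec, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of each metric over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical imbalanced toy labels: 8 majority samples, 2 minority samples.
y_true = ["gas"] * 8 + ["water"] * 2
y_pred = ["gas"] * 8 + ["gas", "water"]

scores = per_class_scores(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_average(scores)
print(f"macro precision={macro_p:.3f} recall={macro_r:.3f} f1={macro_f1:.3f}")
```

Because every class counts equally in a macro average, a single missed minority sample lowers the recall far more than it lowers plain accuracy, which is why such averages are commonly preferred for imbalanced data.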

This article was translated by a computer program; in case of any discrepancy, the English original prevails.


A novel class-imbalance learning framework for fluid recognition: Application to Qingshimao-Gaoshawo tight-sand gas reservoirs in the Ordos Basin, China

Methods based on mathematical models, developed for conventional oil and gas resources, often perform poorly in fluid recognition for tight-sand reservoirs because factors such as reservoir lithology and pore structure interfere with one another. Rapidly advancing artificial intelligence technologies and accumulating logging data provide a solid foundation for applying machine learning methods as new tools for fluid identification. However, the collected well logging data often show a serious class imbalance in the proportions of the categories, which can easily prevent a classifier from achieving ideal results. This issue has consequently become a major challenge for the academic and industrial communities. To address it, a novel class-imbalance learning framework for fluid recognition (CILF) is proposed and applied to the tight-sand gas reservoirs of the Qingshimao-Gaoshawo area of the Ordos Basin, China. At the data level, an improved label propagation algorithm based on semi-supervised learning (SS-LPA) is designed, which assigns high-confidence labels to unlabeled samples and thereby reduces the imbalance rate of the raw data to some extent. At the model level, the Q-network, an effective reinforcement learning approach, is introduced into an ensemble learning framework (QNEL), which enhances the multi-class accuracy of fluid identification by training multiple baseline models that are given different weights based on feedback on imbalanced data. Experimental results from 35 tight-sand wells in the Qingshimao-Gaoshawo area validate the effectiveness of the proposed framework. Specifically, CILF performs best on all three typical evaluation metrics and outperforms the other methods in 12 of the 18 categories. Averaged over the six categories, the precision, recall, and F1 score of the proposed framework reach 0.988, 0.984, and 0.985, respectively.
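The data-level idea in the abstract, propagating labels to unlabeled samples and keeping only the high-confidence ones so that minority classes gain members, can be illustrated with textbook label propagation plus a confidence threshold. This is a sketch under stated assumptions and not the authors' SS-LPA: the RBF affinity, the `gamma` and `threshold` values, the helper names, and the two-class toy points are all hypothetical.

```python
import math
from collections import Counter


def rbf_affinity(X, gamma):
    """Dense RBF affinity matrix with a zero diagonal."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                W[i][j] = math.exp(-gamma * d2)
    return W


def propagate(X, y, n_classes, gamma=1.0, n_iter=100):
    """Plain label propagation; y[i] is a class index, or -1 if unlabeled."""
    n = len(X)
    T = []
    for row in rbf_affinity(X, gamma):
        total = sum(row) or 1.0  # guard against an isolated point
        T.append([w / total for w in row])  # row-normalised transitions
    F = [[float(y[i] == c) for c in range(n_classes)] if y[i] >= 0
         else [1.0 / n_classes] * n_classes for i in range(n)]
    for _ in range(n_iter):
        F_new = []
        for i in range(n):
            if y[i] >= 0:
                F_new.append(F[i])  # clamp the known labels
            else:
                F_new.append([sum(T[i][j] * F[j][c] for j in range(n))
                              for c in range(n_classes)])
        F = F_new
    return F


def confident_pseudo_labels(F, y, threshold):
    """Keep (index, label, confidence) for unlabeled rows above threshold."""
    kept = []
    for i, row in enumerate(F):
        if y[i] < 0 and max(row) >= threshold:
            kept.append((i, row.index(max(row)), max(row)))
    return kept


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical toy data: three labeled majority points, one labeled minority
# point, and four unlabeled points (label -1), one of them ambiguous.
X = [(0.0, 0.0), (0.3, 0.1), (0.1, 0.4), (5.0, 5.0),
     (0.2, 0.2), (4.8, 5.1), (5.2, 4.9), (2.5, 2.5)]
y = [0, 0, 0, 1, -1, -1, -1, -1]

F = propagate(X, y, n_classes=2)
accepted = confident_pseudo_labels(F, y, threshold=0.95)
before = [c for c in y if c >= 0]
after = before + [c for _, c, _ in accepted]
print(imbalance_ratio(before), imbalance_ratio(after))
```

In this toy case the ambiguous midpoint stays unlabeled because its confidence falls below the threshold, while two pseudo-labeled minority points lower the imbalance ratio of the labeled set.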


Source journal

Computers & Geosciences (Geosciences, Multidisciplinary)

CiteScore: 9.30

Self-citation rate: 6.80%

Articles published: 164

Review time: 3.4 months

About the journal: Computers & Geosciences publishes high-impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.