Jun Yi, ZhongLi Qi, XiangChengZhen Li, Fuqiang Lai, Wei Zhou
Computers & Geosciences, Volume 205, Article 105993. Published 2025-07-14. DOI: 10.1016/j.cageo.2025.105993
https://www.sciencedirect.com/science/article/pii/S0098300425001438
A novel class-imbalance learning framework for fluid recognition: Application to Qingshimao-Gaoshawo tight-sand gas reservoirs in the Ordos Basin, China
Mathematical model-based methods developed for conventional oil and gas resources often perform poorly in fluid recognition for tight-sand reservoirs, because factors such as reservoir lithology and pore structure interfere with one another. Advances in artificial intelligence and the growing volume of logging data provide a solid foundation for applying machine learning methods as new tools for fluid identification. However, the collected well logging data often show a serious class imbalance in the proportions of the categories, which can easily prevent a classifier from achieving ideal results. This issue has become a major challenge for the academic and industrial communities. To address it, a novel class-imbalance learning framework for fluid recognition (CILF) is proposed for the tight-sand gas reservoirs of the Qingshimao-Gaoshawo area of the Ordos Basin, China. At the data level, an improved label propagation algorithm based on semi-supervised learning (SS-LPA) is designed, which can reduce the imbalance rate of the raw data to some extent after assigning high-confidence labels to unlabeled samples. At the model level, the Q-network, an effective reinforcement learning approach, is introduced into an ensemble learning framework (QNEL), which can enhance the multi-class accuracy of fluid identification by training multiple baseline models that are given different weights for feedback on imbalanced data. Experimental results from 35 tight-sand wells in the Qingshimao-Gaoshawo area of the Ordos Basin validate the effectiveness of the proposed framework. Specifically, CILF performs best on all three typical evaluation metrics, and it outperforms the other methods in 12 out of a total of 18 categories. Averaged over the six categories, the precision, recall, and F1 score of the proposed framework reach 0.988, 0.984, and 0.985, respectively.
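The data-level idea in the abstract, accepting only high-confidence labels for unlabeled samples and thereby lowering the imbalance rate, can be illustrated with a minimal sketch. This is not the authors' SS-LPA: the class names, the toy probabilities, the 0.9 threshold, and the helper names `imbalance_ratio` and `accept_confident` are all assumptions made for illustration, and the imbalance ratio is taken here as the largest class count divided by the smallest.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Largest class count divided by smallest class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def accept_confident(probabilities, classes, threshold=0.9):
    """Return (sample_index, label) pairs whose top class probability meets the threshold."""
    accepted = []
    for i, row in enumerate(probabilities):
        best = max(range(len(row)), key=row.__getitem__)
        if row[best] >= threshold:
            accepted.append((i, classes[best]))
    return accepted


# Hypothetical toy data, not taken from the paper.
classes = ["gas", "water", "dry"]
labeled = ["dry"] * 8 + ["water"] * 4 + ["gas"] * 2

# Class probabilities for unlabeled samples, as any label-propagation
# step might produce them (columns follow the order of `classes`).
unlabeled_probs = [
    [0.95, 0.03, 0.02],  # confident "gas"
    [0.40, 0.35, 0.25],  # below threshold, left unlabeled
    [0.92, 0.05, 0.03],  # confident "gas"
    [0.05, 0.91, 0.04],  # confident "water"
    [0.10, 0.30, 0.60],  # below threshold, left unlabeled
]

pseudo = accept_confident(unlabeled_probs, classes, threshold=0.9)
augmented = labeled + [label for _, label in pseudo]

print(imbalance_ratio(labeled))    # 8 / 2 = 4.0
print(imbalance_ratio(augmented))  # 8 / 4 = 2.0
```

In this toy case the ratio drops only because the accepted pseudo-labels happen to fall in the minority classes; when most confident predictions belong to the majority class, the same step can leave the ratio unchanged or make it worse.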
About the journal:
Computers & Geosciences publishes high impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.