Qingwei Pang, Chenglizhao Chen, Wenhao Li, Shanchen Pang
Knowledge-Based Systems, Volume 323, Article 113843. Published 2025-05-29. DOI: 10.1016/j.knosys.2025.113843. Impact factor: 7.2 (JCR Q1, Computer Science, Artificial Intelligence). https://www.sciencedirect.com/science/article/pii/S0950705125008895
Multi-domain masked reconstruction self-supervised learning for lithology identification using well-logging data
Lithology identification is crucial in energy exploration and oil and gas drilling, particularly for unconventional reservoirs, where the complexity and high heterogeneity of rock formations pose significant challenges for prospecting and exploration. To address the dual challenges of scarce labeled data and low accuracy of lithology identification models, in this study we proposed a novel dual-domain masked reconstruction self-supervised learning (MR-SSL) framework. The framework comprised two stages, self-supervised pretraining and supervised fine-tuning, and was designed to significantly improve the accuracy of lithology identification using only a small number of labeled samples. In the pretraining stage, we designed three tasks: time-domain masked reconstruction, frequency-domain masked reconstruction, and time–frequency contrastive learning, each supported by a specifically designed loss function. The two masked reconstruction tasks achieved multi-dimensional feature modeling through differentiated designs: the time-domain task combined cross-depth and cross-parameter dynamic masking strategies to adaptively capture stratigraphic non-stationarity based on periodic analysis, whereas the frequency-domain task simultaneously learned single-parameter specificity and multi-parameter correlation through a shared-private embedding mechanism. Together with the time–frequency contrastive learning task, these tasks made the learned features more complementary by imposing cross-domain feature consistency constraints. In the supervised fine-tuning stage, the pretrained encoder was frozen, the time–frequency features were integrated, and a classification head was trained, further enhancing the model's lithology classification capability under the target geological conditions. Experimental validation demonstrated that the MR-SSL model achieved accuracies of 98.7% and 97.07% on two different oilfield datasets while using only 20% of the labeled data, surpassing conventional supervised and self-supervised methods. The proposed model presents a unique advantage: it enables the deep decoupling and complementary utilization of time–frequency features in logging data through multi-task collaboration, thereby providing an efficient low-label solution for lithology identification in unconventional reservoirs.
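The abstract names the three pretraining objectives but gives no formulas, so the following is only a minimal sketch of what such losses could look like, not the authors' implementation. It assumes uniform random masking in place of the paper's periodicity-based dynamic masking, a mean-squared-error reconstruction loss, and a standard InfoNCE contrastive loss. All function names are hypothetical, and `reconstruct` stands in for any encoder-decoder model.

```python
import numpy as np


def random_mask(shape, ratio, rng):
    """Boolean mask where True marks an entry hidden from the model.

    Assumption: uniform random masking, not the paper's dynamic strategy.
    """
    return rng.random(shape) < ratio


def time_masked_loss(logs, reconstruct, ratio, rng):
    """MSE over masked samples of a (depth, n_curves) log array."""
    mask = random_mask(logs.shape, ratio, rng)
    corrupted = np.where(mask, 0.0, logs)  # zero out the hidden samples
    recon = reconstruct(corrupted)
    if not mask.any():
        return 0.0
    return float(np.mean((recon[mask] - logs[mask]) ** 2))


def freq_masked_loss(logs, reconstruct, ratio, rng):
    """Squared error over masked frequency bins, FFT taken along depth."""
    spec = np.fft.rfft(logs, axis=0)
    mask = random_mask(spec.shape, ratio, rng)
    # Remove the masked bins, then return to the depth domain as model input.
    corrupted = np.fft.irfft(np.where(mask, 0.0, spec), n=logs.shape[0], axis=0)
    recon_spec = np.fft.rfft(reconstruct(corrupted), axis=0)
    if not mask.any():
        return 0.0
    return float(np.mean(np.abs(recon_spec[mask] - spec[mask]) ** 2))


def info_nce(z_time, z_freq, temperature=0.1):
    """InfoNCE loss: row i of z_time is the positive pair of row i of z_freq."""
    zt = z_time / np.linalg.norm(z_time, axis=1, keepdims=True)
    zf = z_freq / np.linalg.norm(z_freq, axis=1, keepdims=True)
    logits = zt @ zf.T / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))
```

In a pretraining loop, the three values would typically be combined as a weighted sum; the weights, like the masking ratio and temperature here, are illustrative choices rather than values reported in the abstract.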
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.