Developing a large language model for oil- and gas-related rock mechanics: Progress and challenges

Impact factor: 6.5; CAS Zone 3 (Engineering and Technology); JCR Q2 (Energy & Fuels)

Natural Gas Industry B, Pub Date: 2025-04-01, DOI: 10.1016/j.ngib.2025.03.007

Botao Lin, Yan Jin, Qianwen Cao, Han Meng, Huiwen Pang, Shiming Wei

Journal article. Natural Gas Industry B, Volume 12, Issue 2, Pages 110-122. Full text: https://www.sciencedirect.com/science/article/pii/S235285402500021X

Citations: 0

Abstract

Developing a large language model for oil- and gas-related rock mechanics: Progress and challenges

In recent years, large language models (LLMs) have demonstrated immense potential for improving work efficiency and decision-making in practical applications. However, few specialized LLMs have been developed for oil and gas engineering. To aid the exploration and development of deep and ultra-deep unconventional reservoirs, a customized LLM for oil- and gas-related rock mechanics is needed, one that can handle complex professional data and make intelligent predictions and decisions. To that end, we first review general and industry-specific LLMs. We then propose a systematic workflow for building this domain-specific LLM for oil and gas engineering, comprising data collection and processing, model construction and training, model validation, and implementation in the specific domain. Three application scenarios are investigated: knowledge extraction from textual resources, field operation with multidisciplinary integration, and intelligent decision assistance. Finally, several challenges in developing this domain-specific LLM are highlighted. Our key findings are as follows. Geological surveys, laboratory experiments, field tests, and numerical simulations form the four original sources of rock mechanics data. These data must flow through collection, storage, processing, and governance before being fed into LLM training. The domain-specific LLM can be trained by fine-tuning a general open-source LLM with professional data and constraints, such as rock mechanics datasets and principles. The LLM can then follow the commonly used training and validation processes before being deployed in the oil and gas field. However, there are three primary challenges in building this domain-specific LLM: data standardization, data security and access, and striking a compromise between physics and data when building the model structure. Some of these challenges are administrative rather than technical, and overcoming them requires close collaboration among the interested parties and professional practitioners.
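The data flow named in the abstract (four original sources, then collection, storage, processing, and governance before training) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the `Record` type, the `confidential` flag, the in-memory dictionary standing in for a database, and every function name below are hypothetical, and the abstract does not say what each stage actually does.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """The four original sources of rock mechanics data named in the abstract."""
    GEOLOGICAL_SURVEY = "geological survey"
    LAB_EXPERIMENT = "laboratory experiment"
    FIELD_TEST = "field test"
    NUMERICAL_SIMULATION = "numerical simulation"


@dataclass
class Record:
    source: Source
    text: str
    confidential: bool = False  # hypothetical governance flag, not from the paper


def collect(raw):
    """Collection: wrap (source, text, confidential) tuples as Records."""
    return [Record(Source(s), t, c) for s, t, c in raw]


def store(records):
    """Storage: group records by source (in-memory stand-in for a database)."""
    db = {src: [] for src in Source}
    for r in records:
        db[r.source].append(r)
    return db


def process(db):
    """Processing: normalize whitespace and drop records that end up empty."""
    cleaned = []
    for recs in db.values():
        for r in recs:
            text = " ".join(r.text.split())
            if text:
                cleaned.append(Record(r.source, text, r.confidential))
    return cleaned


def govern(records):
    """Governance: keep only records cleared for use in training."""
    return [r for r in records if not r.confidential]


def build_training_corpus(raw):
    """Chain the four stages in the order the abstract lists them."""
    return govern(process(store(collect(raw))))
```

Calling `build_training_corpus` on a list of `(source, text, confidential)` tuples returns the records that survive cleaning and the access check. In practice each stage would be far more involved (format standardization, access control, unit handling), which is where the data standardization and data security challenges raised in the abstract would appear.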

Source journal

Natural Gas Industry B (Earth and Planetary Sciences: Geology)

CiteScore: 5.80
Self-citation rate: 6.10%
Articles published: not listed
Review time: 79 days