Intelligent coalbed methane drainage optimization: A deep reinforcement learning-driven life-cycle strategy

Chen Liu, Bin Gong, HaoQiang Wu, Hu Huang, Heng Zhao

Energy and AI, Volume 22, Article 100598. Published 2025-08-27. DOI: 10.1016/j.egyai.2025.100598
Citations: 0
Abstract
Coalbed methane (CBM) production, a significant component of unconventional natural gas development, depends on life-cycle drainage optimization to enhance output and economic returns. Traditional drainage strategies rely on experience and trial and error, making them difficult to adapt to complex, dynamic production environments. This study proposes a life-cycle production and drainage optimization strategy for CBM based on Deep Reinforcement Learning (DRL). Using the Deep Q-Network (DQN) algorithm, the drainage strategy is learned and optimized over the production process, enabling intelligent decision-making for drainage operations. An auto-regressive surrogate model, built on a deep learning architecture (CNN-LSTM), provides the reinforcement learning environment for gas production and drainage. This surrogate replaces the computationally expensive full-physics simulation model, significantly accelerating the agent-environment interaction in DRL. Two reinforcement learning strategies were considered, using either cumulative gas production or Net Present Value (NPV) as the reward. The results show that the DRL drainage strategy with NPV as the reward increased the net gain by 5.83% relative to historical data. Compared with traditional methods, this approach significantly improves NPV while optimizing the drainage strategy. The findings demonstrate that the DRL-based life-cycle drainage optimization method for CBM is both efficient and feasible, and it offers an intelligent solution for developing unconventional natural gas resources. The results also highlight the method's strong adaptability and potential for addressing complex optimization problems in dynamic production environments.
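The loop the abstract describes — an agent choosing drawdown actions against a fast surrogate of the reservoir, with per-step NPV increments as the reward — can be sketched minimally. This is an illustrative toy only: the paper uses a DQN with a CNN-LSTM surrogate trained on simulation data, whereas here a hand-written linear surrogate stands in for the CNN-LSTM and tabular Q-learning stands in for the DQN. All function names, prices, and coefficients (`surrogate_step`, `GAS_PRICE`, `OPEX`, `DISCOUNT_RATE`, the drawdown levels) are assumptions, not values from the paper.

```python
import numpy as np

GAS_PRICE = 1.2      # revenue per unit of gas (assumed)
OPEX = 0.5           # operating cost per step of pumping (assumed)
DISCOUNT_RATE = 0.1  # per-step discount rate for NPV (assumed)

def surrogate_step(pressure, action):
    """Toy auto-regressive surrogate: returns next normalized bottom-hole
    pressure and gas rate given the current pressure and a drawdown action
    (0 = hold, 1 = mild, 2 = aggressive). Stands in for the CNN-LSTM."""
    drawdown = [0.0, 0.05, 0.12][action]
    next_pressure = max(pressure * (1.0 - drawdown), 0.1)
    # Gas rate rises as pressure falls (desorption proxy).
    gas_rate = (1.0 - next_pressure) * 10.0
    return next_pressure, gas_rate

def npv_reward(gas_rate, step):
    """Per-step NPV increment: discounted revenue minus operating cost."""
    cash_flow = GAS_PRICE * gas_rate - OPEX
    return cash_flow / (1.0 + DISCOUNT_RATE) ** step

def train(episodes=500, horizon=20, eps=0.2, alpha=0.1, gamma=0.95, seed=0):
    """Epsilon-greedy Q-learning over a discretized pressure state.
    The paper's DQN replaces this table with a neural Q-network."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = 11, 3  # pressure binned into 11 levels
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        pressure = 1.0
        for step in range(horizon):
            s = int(round(pressure * (n_states - 1)))
            a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
            pressure, gas = surrogate_step(pressure, a)
            r = npv_reward(gas, step)
            s2 = int(round(pressure * (n_states - 1)))
            # Temporal-difference update toward the NPV-based return.
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    return Q

Q = train()
policy = Q.argmax(axis=1)  # greedy drawdown level per pressure bin
```

Because the surrogate is evaluated instead of a full-physics simulator, each episode is cheap, which is the acceleration the abstract attributes to the CNN-LSTM environment; swapping the reward function between `npv_reward` and raw `gas_rate` reproduces, in miniature, the paper's comparison of the two reward designs.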