Intelligent coalbed methane drainage optimization: A deep reinforcement learning-driven life-cycle strategy

Chen Liu, Bin Gong, HaoQiang Wu, Hu Huang, Heng Zhao

Energy and AI, Volume 22, Article 100598. Published 2025-08-27. DOI: 10.1016/j.egyai.2025.100598
Citations: 0
Abstract
Coalbed methane (CBM) production, a significant component of unconventional natural gas development, depends on life-cycle drainage optimization to enhance output and economic returns. Traditional drainage strategies rely on experience and trial and error, making them difficult to adapt to complex, dynamic production environments. This study proposes a life-cycle production and drainage optimization strategy for CBM based on Deep Reinforcement Learning (DRL). Using the Deep Q-Network (DQN) algorithm, the drainage strategy is learned and optimized over the production process, enabling intelligent decision-making for drainage operations. An auto-regressive surrogate model, built on a deep learning architecture (CNN-LSTM), provides the reinforcement learning environment for gas production and drainage. This surrogate replaces the computationally expensive full-physics simulation model, significantly accelerating the agent-environment interaction in DRL. Two reinforcement learning strategies were considered, using either cumulative gas production or Net Present Value (NPV) as the reward. The results show that the DRL drainage strategy with NPV as the reward increased the net gain by 5.83% relative to historical data. Compared with traditional methods, this approach significantly improves NPV while optimizing the drainage strategy. The findings demonstrate that the DRL-based life-cycle drainage optimization method for CBM is both efficient and feasible, and it offers an intelligent solution for developing unconventional natural gas resources. The results also highlight the method's strong adaptability and potential for addressing complex optimization problems in dynamic production environments.
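The loop the abstract describes — an agent choosing drawdown actions against a fast surrogate of the reservoir, with per-step NPV increments as the reward — can be sketched minimally. This is an illustrative toy only: the paper uses a DQN with a CNN-LSTM surrogate trained on simulation data, whereas here a hand-written linear surrogate stands in for the CNN-LSTM and tabular Q-learning stands in for the DQN. All function names, prices, and coefficients (`surrogate_step`, `GAS_PRICE`, `OPEX`, `DISCOUNT_RATE`, the drawdown levels) are assumptions, not values from the paper.

```python
import numpy as np

GAS_PRICE = 1.2      # revenue per unit of gas (assumed)
OPEX = 0.5           # operating cost per step of pumping (assumed)
DISCOUNT_RATE = 0.1  # per-step discount rate for NPV (assumed)

def surrogate_step(pressure, action):
    """Toy auto-regressive surrogate: returns next normalized bottom-hole
    pressure and gas rate given the current pressure and a drawdown action
    (0 = hold, 1 = mild, 2 = aggressive). Stands in for the CNN-LSTM."""
    drawdown = [0.0, 0.05, 0.12][action]
    next_pressure = max(pressure * (1.0 - drawdown), 0.1)
    # Gas rate rises as pressure falls (desorption proxy).
    gas_rate = (1.0 - next_pressure) * 10.0
    return next_pressure, gas_rate

def npv_reward(gas_rate, step):
    """Per-step NPV increment: discounted revenue minus operating cost."""
    cash_flow = GAS_PRICE * gas_rate - OPEX
    return cash_flow / (1.0 + DISCOUNT_RATE) ** step

def train(episodes=500, horizon=20, eps=0.2, alpha=0.1, gamma=0.95, seed=0):
    """Epsilon-greedy Q-learning over a discretized pressure state.
    The paper's DQN replaces this table with a neural Q-network."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = 11, 3  # pressure binned into 11 levels
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        pressure = 1.0
        for step in range(horizon):
            s = int(round(pressure * (n_states - 1)))
            a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
            pressure, gas = surrogate_step(pressure, a)
            r = npv_reward(gas, step)
            s2 = int(round(pressure * (n_states - 1)))
            # Temporal-difference update toward the NPV-based return.
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    return Q

Q = train()
policy = Q.argmax(axis=1)  # greedy drawdown level per pressure bin
```

Because the surrogate is evaluated instead of a full-physics simulator, each episode is cheap, which is the acceleration the abstract attributes to the CNN-LSTM environment; swapping the reward function between `npv_reward` and raw `gas_rate` reproduces, in miniature, the paper's comparison of the two reward designs.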