A novel machine learning approach for reservoir temperature prediction

IF 3.5 | Zone 2, Engineering Technology | JCR Q3, ENERGY & FUELS
Haoxin Shi, Yanjun Zhang, Yuxiang Cheng, Jixiang Guo, Jianqiao Zheng, Xin Zhang, Yude Lei, Yongjie Ma, Lin Bai
Geothermics, vol. 125, Article 103204. Published 2024-11-17. DOI: 10.1016/j.geothermics.2024.103204
https://www.sciencedirect.com/science/article/pii/S0375650524002906

Abstract

Accurately assessing geothermal potential is a significant global challenge, and reservoir temperature prediction models are a key part of evaluating that potential. Machine learning is an effective tool for building such models. However, predictive accuracy often suffers when complex, nonlinear input features are not fully screened before modeling and when the dataset is too small. This study collected hydrochemical test data from 65 groundwater samples taken in the Guide area of Qinghai Province between 2009 and 2016. To address missing data, we used the LRTC-TNN method to supplement the dataset. We then normalized the data and used Pearson correlation coefficients to identify important features. On the processed dataset, we built XGBoost and LightGBM models and used 5-fold cross-validation with Bayesian optimization to select the best combination of model parameters. We compared the advantages and disadvantages of the two models and evaluated their accuracy, robustness, and generalization capability. The results indicate that the model performs best when 80% of the training data is used. The LRTC-TNN model effectively fills in missing data, achieving an accuracy exceeding 95%. Across the training set, test set, and complete dataset, the XGBoost model consistently yielded strong predictive results: an R² of 98.09%, an RMSE of 0.546, and an MAE of 0.396. Robustness analysis showed that the XGBoost model is the more robust of the two, while feature importance and sensitivity analyses identified chloride ions as the key independent variable affecting reservoir temperature predictions. Generalization validation further indicated that the model adapts well to different datasets and provides accurate predictions. In conclusion, the XGBoost model, which takes the supplemented data into account, generalizes well in reservoir temperature prediction and provides a reliable way to determine underground reservoir temperatures accurately.
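The feature screening and evaluation steps named in the abstract can be illustrated with a small sketch. It is not the authors' code: the abstract does not say which normalization was used, so min-max scaling is an assumption here, and the sample values and the feature names `Cl` and `Na` are made up for illustration. The sketch uses only the Python standard library.

```python
import math

def min_max_normalize(values):
    """Scale numbers to [0, 1] (assumed normalization; constant input maps to zeros)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # undefined for constant input; treated as 0 in this sketch
    return cov / (sx * sy)

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE for paired observations and predictions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"r2": r2, "rmse": rmse, "mae": mae}

# Hypothetical toy data, NOT taken from the paper.
features = {
    "Cl": [12.0, 25.0, 31.0, 44.0, 58.0],
    "Na": [80.0, 60.0, 95.0, 70.0, 88.0],
}
temperature = [61.0, 66.0, 70.0, 79.0, 84.0]

# Rank features by absolute Pearson correlation with the target.
target = min_max_normalize(temperature)
ranking = sorted(
    ((name, pearson(min_max_normalize(col), target)) for name, col in features.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

Pearson correlation is unchanged by min-max scaling, so the normalization step here matters for the downstream models rather than for the ranking itself. Note also that `regression_metrics` returns R² as a fraction, whereas the abstract reports it as a percentage.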
Source journal: Geothermics
Category: Engineering Technology, Geosciences (Multidisciplinary)
CiteScore: 7.70
Self-citation rate: 15.40%
Articles published: 237
Review time: 4.5 months
Journal description: Geothermics is an international journal devoted to the research and development of geothermal energy. The International Board of Editors of Geothermics, which comprises specialists in the various aspects of geothermal resources, exploration and development, guarantees the balanced, comprehensive view of scientific and technological developments in this promising energy field. It promulgates the state of the art and science of geothermal energy, its exploration and exploitation through a regular exchange of information from all parts of the world. The journal publishes articles dealing with the theory, exploration techniques and all aspects of the utilization of geothermal resources. Geothermics serves as the scientific house, or exchange medium, through which the growing community of geothermal specialists can provide and receive information.