A New Method of Rockburst Prediction for Categories with Sparse Data Using Improved XGBoost Algorithm

IF 4.8 | Zone 2 (Earth Sciences) | JCR Q1, GEOSCIENCES, MULTIDISCIPLINARY
Ming Tao, Qizheng Zhao, Rui Zhao, Memon Muhammad Burhan
Natural Resources Research. Published 2024-09-24. DOI: 10.1007/s11053-024-10412-7 (https://doi.org/10.1007/s11053-024-10412-7)
Citations: 0

Abstract

Rockburst prediction significantly affects the development and utilization of underground resources. An increasing number of artificial intelligence algorithms are being applied to rockburst prediction. However, because data for certain rockburst grades are scarce, machine learning models struggle to learn the characteristics of those grades, which results in bias or overfitting. In this study, 321 rockburst cases from around the world were collected. Seven indices covering both rock mechanics and stress conditions were selected as input parameters for the model. To address the limited data for certain rockburst grades, the Synthetic Minority Over-sampling TEchnique (SMOTE) algorithm was used for comprehensive oversampling and synthesis of the rockburst data. The theoretical rationality of this method was corroborated using Spearman's correlation coefficient. The model hyperparameters were then optimized with Bayesian optimization, and an improved eXtreme gradient boosting (XGBoost) rockburst prediction model (SM-BO-XGBoost) was established. The SM-BO-XGBoost model was compared with decision tree, random forest, support vector machine, and k-nearest neighbor classifiers. The results showed a significant improvement in prediction accuracy for the None and Strong rockburst categories, which had limited data in the original rockburst dataset. To address the poor interpretability of the XGBoost model, the SHapley Additive exPlanations (SHAP) method was introduced to explain the constructed model and to analyze the marginal contributions of different features to the model output across rockburst grades. The SM-BO-XGBoost model was validated using field rockburst records from the Xincheng and Sanshandao gold mines. The results indicate that the model performs well on these records and has wide potential for predicting rockbursts in engineering practice.
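
The abstract does not give implementation details for the oversampling step. As a rough illustration only, the sketch below shows the interpolation idea at the core of SMOTE: a synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The function name `smote_oversample`, the parameters `n_new`, `k`, and `seed`, and the two-feature sample values are all hypothetical and are not taken from the study, which uses seven indices and also applies Bayesian optimization and XGBoost (neither is reproduced here).

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature samples for a sparse class (not data from the study).
strong_class = [[0.55, 2.1], [0.60, 2.4], [0.72, 2.0], [0.66, 2.8]]
new_samples = smote_oversample(strong_class, n_new=6, k=3)
```

Because every synthetic point is a convex combination of two real minority samples, each of its feature values stays within the range observed for that class. In practice a maintained library implementation would normally be preferred over a hand-written one.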

Source journal: Natural Resources Research
Subject area: Environmental Science, General Environmental Science
CiteScore: 11.90
Self-citation rate: 11.10%
Articles published: 151
Journal description: This journal publishes quantitative studies of natural (mainly but not limited to mineral) resources exploration, evaluation and exploitation, including environmental and risk-related aspects. Typical articles use geoscientific data or analyses to assess, test, or compare resource-related aspects. NRR covers a wide variety of resources including minerals, coal, hydrocarbon, geothermal, water, and vegetation. Case studies are welcome.