Stacking ensemble learning algorithm based rapid inverse modelling of copper grade using imaging spectral data

IF 3.7, Zone 2 (Chemistry), JCR Q2 (AUTOMATION & CONTROL SYSTEMS)

Chemometrics and Intelligent Laboratory Systems, Pub Date: 2024-12-12, DOI: 10.1016/j.chemolab.2024.105308

Jingli Wang, Jingxiang Gao

Volume 257, Article 105308. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S016974392400248X

Citations: 0

Abstract

Fast, accurate determination of copper ore grade is of great practical significance for ore dressing and ore allocation in mines. The most common method of determining copper ore grade is chemical analysis, which has several disadvantages: a lengthy determination period, the risk of chemical pollution, and results that lag behind ore dressing and ore allocation. Hyperspectral imaging provides both spectral and image resolution, and can obtain the indicators of the sample being measured while preserving its original physical and chemical properties. It can therefore overcome the shortcomings of traditional methods, allowing accurate, non-destructive, environmentally friendly and rapid detection of samples. By combining the predictions of multiple models, Stacking can often achieve higher predictive accuracy than a single model, with the further advantages of reduced overfitting, model diversity, flexibility and adaptability. However, the Stacking ensemble learning algorithm is rarely used for hyperspectral quantitative inversion modelling. In this study, 138 copper samples from the Mirador Copper Mine were used as the data source. The spectral data of the copper samples were collected using a Pika L and a Pika NIR-320 hyperspectral imager, and the copper grades were determined by chemical analysis. First, mutual information was computed on the raw spectral data as a means of serial fusion of the spectra, and the fused data were smoothed with a Savitzky-Golay (SG) filter to remove measurement noise. The pre-processed spectral data were then subjected to feature band extraction with the CARS and CARS-SPA algorithms in order to eliminate uninformative variables and extract valid spectral information.
Finally, a highly reliable copper grade estimation model was built with the Stacking algorithm by combining several machine learning methods, and transfer learning was used to verify the accuracy and generalisation of the model. The results indicate that the feature bands selected by CARS-SPA cover spectral ranges carrying sufficient chemical information while largely excluding uninformative variables, which notably increases the speed and accuracy of the inversion modelling. The Stacking ensemble learning model is better suited to predicting copper grade at the Mirador copper mine than a single inversion model, and the CARS-SPA-Stacking inversion model has the highest accuracy, with R², RMSE, MAE, RPD, MAPE and CV of 0.936, 0.040, 0.019, 4.018, 0.059 and 0.267, respectively. This study is relevant to the application of fused imaging spectral data, combined with the Stacking ensemble learning algorithm, to copper grade inversion at the Mirador copper mine.
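The smoothing, stacking and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the spectra are synthetic, the base learners (ridge regression, SVR, random forest) and the ridge meta-learner are arbitrary choices because the abstract does not name the models that were combined, and the mutual-information fusion, CARS-SPA band selection and transfer-learning steps are omitted. RPD is computed as the standard deviation of the reference values divided by RMSE, and MAPE is reported as a fraction; the CV metric is left out because the abstract does not define it.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, RPD and MAPE for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))
    return {
        "R2": float(r2_score(y_true, y_pred)),
        "RMSE": rmse,
        "MAE": float(mean_absolute_error(y_true, y_pred)),
        # RPD: sample standard deviation of the reference values over RMSE.
        "RPD": float(np.std(y_true, ddof=1) / rmse),
        # MAPE as a fraction (not a percentage); requires non-zero y_true.
        "MAPE": float(np.mean(np.abs((y_true - y_pred) / y_true))),
    }


def demo(n_samples=138, n_bands=200, seed=0):
    """Fit a stacked regressor on synthetic spectra and return test metrics."""
    rng = np.random.default_rng(seed)

    # Synthetic stand-in data: one absorption feature whose depth scales
    # with a made-up "grade", plus a sloped baseline and Gaussian noise.
    grade = rng.uniform(0.2, 1.5, n_samples)
    bands = np.linspace(0.0, 1.0, n_bands)
    feature = np.exp(-((bands - 0.6) ** 2) / 0.005)
    spectra = (
        0.5
        + 0.1 * bands[None, :]
        - 0.2 * grade[:, None] * feature[None, :]
        + rng.normal(0.0, 0.01, (n_samples, n_bands))
    )

    # Savitzky-Golay smoothing along the band axis (window and order assumed).
    smoothed = savgol_filter(spectra, window_length=11, polyorder=2, axis=1)

    X_train, X_test, y_train, y_test = train_test_split(
        smoothed, grade, test_size=0.3, random_state=seed
    )

    # Base learners produce out-of-fold predictions (cv=5) that are used
    # to train the final estimator, which is the core idea of stacking.
    model = StackingRegressor(
        estimators=[
            ("ridge", Ridge(alpha=1.0)),
            ("svr", SVR(C=10.0)),
            ("rf", RandomForestRegressor(n_estimators=50, random_state=seed)),
        ],
        final_estimator=Ridge(alpha=1.0),
        cv=5,
    )
    model.fit(X_train, y_train)
    return regression_metrics(y_test, model.predict(X_test))


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.3f}")
```

Because the data are synthetic, the printed scores say nothing about the values reported in the abstract; the sketch only shows how a stacked model and this set of metrics fit together.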




Source journal

Chemometrics and Intelligent Laboratory Systems (Engineering & Technology: Analytical Chemistry)

CiteScore

7.50

Self-citation rate

7.70%

Articles published

169

Review time

3.4 months

About the journal: Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on the development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines. Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data. The journal deals with the following topics: 1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, Systems Biology, -Omics, etc.). 2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration, etc.). Routine applications of established chemometrical techniques will not be considered. 3) Development of new software that provides novel tools or truly advances the use of chemometrical methods. 4) Well-characterized data sets to test the performance of new methods and software. The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for Manuscripts.