A robust soft voting ensemble of the isolation forest model, extended isolation forest model and generalized isolation forest model for multivariate geochemical anomaly recognition
Authors: Chenyi Zheng, Yongliang Chen, Xudong Du
Journal: Ore Geology Reviews, volume 185, Article 106787 (impact factor 3.2; JCR Q1, Geology)
Publication date: 2025-07-16
DOI: 10.1016/j.oregeorev.2025.106787
URL: https://www.sciencedirect.com/science/article/pii/S0169136825003476
Abstract
Rapid and effective recognition of metallogenic anomalies in geochemical exploration data is key to quickly locating potential mineral prospecting areas. The isolation forest (IF), extended isolation forest (EIF) and generalized isolation forest (GIF) algorithms are three advanced unsupervised ensemble learning techniques that can isolate anomalies rapidly and effectively in high-dimensional data. Previous studies have shown that all three achieve high performance and efficiency in recognizing multivariate geochemical anomalies. However, they lack robustness because of the randomness in isolation tree construction, including random subsampling with replacement and random selection of the splitting threshold, which causes unstable anomaly patterns in complex geochemical settings. To solve this problem, a robust soft voting ensemble (SVE) model was built from the IF, EIF and GIF models to recognize multivariate geochemical anomalies in the Molidawa area (Inner Mongolia, China). The IF, EIF and GIF models were built on interpolated 1:50,000 stream sediment data and served as the base anomaly recognition models. The soft voting algorithm was then used to combine the three base models into the SVE model. A comparison shows that the SVE model is more robust than any of the three base models. The anomalies recognized by the SVE model contain all the known molybdenum deposits and spatially coincide with factors controlling molybdenum mineralization, such as intermediate-acidic magmatic intrusions and faults. Therefore, in geochemical anomaly recognition, the soft voting algorithm is a feasible tool for building a robust ensemble model from a set of base anomaly recognition models that individually lack robustness.
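The soft voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the base models' anomaly scores are scaled or weighted, so the min-max rescaling, the equal default weights, the function names `min_max_scale` and `soft_vote`, and the score values below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Optional, Sequence


def min_max_scale(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1] so models with different score ranges are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # A constant score vector carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def soft_vote(
    score_sets: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Combine per-model anomaly scores by a weighted average of rescaled scores."""
    if not score_sets:
        raise ValueError("at least one set of scores is required")
    n_samples = len(score_sets[0])
    if any(len(s) != n_samples for s in score_sets):
        raise ValueError("all models must score the same samples")
    if weights is None:
        weights = [1.0] * len(score_sets)  # assumption: equal weights
    if len(weights) != len(score_sets):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    scaled = [min_max_scale(s) for s in score_sets]
    return [
        sum(w * model[i] for w, model in zip(weights, scaled)) / total
        for i in range(n_samples)
    ]


# Hypothetical anomaly scores for four samples (higher = more anomalous).
# These numbers are invented for illustration and do not come from the paper.
if_scores = [0.0, 1.0, 2.0, 4.0]
eif_scores = [0.0, 2.0, 1.0, 4.0]
gif_scores = [1.0, 1.0, 3.0, 5.0]

combined = soft_vote([if_scores, eif_scores, gif_scores])
print(combined)
```

In this toy input the three base models disagree on how to rank the second and third samples, and the averaged score settles between them, which is the stabilizing effect a soft voting ensemble is meant to provide.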
About the journal:
Ore Geology Reviews aims to familiarize all earth scientists with recent advances in a number of interconnected disciplines related to the study of, and search for, ore deposits. The reviews range from brief to longer contributions, but the journal preferentially publishes manuscripts that fill the niche between the commonly shorter journal articles and the comprehensive book coverages, and thus has a special appeal to many authors and readers.