Mineral-resource prediction using advanced data analytics and machine learning of the QUEST-South stream-sediment geochemical data, southwestern British Columbia, Canada
E. Grunsky, D. Arne
Geochemistry: Exploration, Environment, Analysis. Published 2020-10-09. DOI: https://doi.org/10.1144/geochem2020-054
In this study we apply multivariate statistical and predictive classification methods to interpret geochemical data from 8545 stream-sediment samples collected in southern British Columbia, Canada. Data for 35 elements were corrected for laboratory bias and adjusted for values reported below the lower limit of detection. Each sample site was attributed with the closest British Columbia MINFILE occurrence within 2.5 km. MINFILE occurrences were grouped into ‘GroupModels’ based on similarities between the British Columbia Geological Survey mineral deposit models and geochemical signatures. These data were used to create a training dataset of 474 observations, including 100 samples not attributed with a MINFILE occurrence. The training set was used to generate predictions for the mineral deposit models from which posterior probabilities were estimated for the remaining 8071 samples. The data underwent a centred log-ratio transformation and then characterization using either principal component analysis (PCA) or t-distributed stochastic neighbour embedding (t-SNE) using 9 dimensions prior to classification by random forests. The posterior probabilities generated from the t-SNE metric provide a slightly higher level of prediction accuracy compared to the posterior probabilities obtained using the PCA metric. The results are comparable to those obtained using a conventional catchment analysis approach and an expert-driven model. The approach presented here provides a repeatable, consistent and defensible methodology for the identification of prospective mineralized terrains and mineral systems.
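Two of the preprocessing steps named in the abstract can be illustrated with a minimal sketch: the centred log-ratio transformation, and attributing a sample site with the closest occurrence within 2.5 km. This is not the authors' implementation. The function names `clr` and `nearest_within`, the tuple layout for occurrences, and the assumption that coordinates are projected eastings and northings in kilometres are all hypothetical and chosen only for illustration.

```python
import math

def clr(composition):
    """Centred log-ratio: log of each part minus the mean log of all parts."""
    # Assumption: censored (below-detection) values were already replaced,
    # so every concentration is strictly positive.
    if any(v <= 0 for v in composition):
        raise ValueError("CLR needs strictly positive values")
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)
    return [lv - mean_log for lv in logs]

def nearest_within(site, occurrences, max_km=2.5):
    """Return the label of the closest occurrence within max_km, else None.

    site is (x, y); occurrences is an iterable of (label, x, y).
    Assumption: coordinates are projected easting/northing in km.
    """
    best_label, best_dist = None, max_km
    for label, x, y in occurrences:
        d = math.hypot(x - site[0], y - site[1])
        if d <= best_dist:
            best_label, best_dist = label, d
    return best_label
```

CLR-transformed values for one sample always sum to zero, which is why the transformed data lend themselves to PCA-style ordination before classification. A site with no occurrence inside the search radius returns `None`, which corresponds to the samples "not attributed with a MINFILE occurrence".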
About the journal:
Geochemistry: Exploration, Environment, Analysis (GEEA) is a co-owned journal of the Geological Society of London and the Association of Applied Geochemists (AAG).
GEEA focuses on mineral exploration using geochemistry; related fields also covered include geoanalysis, the development of methods and techniques used to analyse geochemical materials such as rocks, soils, sediments, waters and vegetation, and environmental issues associated with mining and source apportionment.
GEEA is well known for its thematic sets on hot topics and regularly publishes papers from the biennial International Applied Geochemistry Symposium (IAGS).
Papers that seek to integrate geological, geochemical and geophysical methods of exploration are particularly welcome, as are those that concern geochemical mapping and those that comprise case histories. Given the many links between exploration and environmental geochemistry, the journal encourages the exchange of concepts and data, in particular to differentiate the various sources of elements.
GEEA publishes research articles, discussion papers, book reviews, editorial content and thematic sets.