Anomaly Detection-Oriented Positive-Unlabeled Metric Learning for Extracting High-Dimensional Geochemical Anomalies Linked to Mineralization

IF 4.8 · Zone 2, Earth Sciences · Q1 GEOSCIENCES, MULTIDISCIPLINARY
Zhaorui Yang, Yongliang Chen
Natural Resources Research · Journal Article · Published 2025-02-05 · DOI: 10.1007/s11053-025-10464-3 (https://doi.org/10.1007/s11053-025-10464-3)
Citations: 0

Abstract

In geochemical exploration, a small number of positive samples and a large number of unlabeled samples can be defined from the geochemical exploration data and the mineral deposits (occurrences) found in the exploration area. The positive samples usually comprise multiple types of mineral deposits (occurrences), while the unlabeled samples usually comprise many background samples and some unknown positive samples. Accurately recognizing the unknown positive samples among the many unlabeled samples is a challenge in exploration geochemistry. To address this challenge, positive-unlabeled (PU) metric learning for anomaly detection (PUMAD) is developed to model positive-unlabeled geochemical exploration data and detect mineralization-related anomalies. PUMAD is a novel PU learning algorithm that combines artificial neural networks with distance hashing-based filtering (DHF) and deep metric learning (DML) to establish an anomaly detection model for datasets with positive and unlabeled samples. To test the effectiveness and robustness of PUMAD in identifying mineralization-related geochemical anomalies, the Baishan area of Jilin Province (China) was chosen as the case study area, and a dataset with positive and unlabeled samples was constructed from the stream sediment geochemical survey data from four 1:200,000 scale geological maps and the spatial locations of more than 30 discovered polymetallic deposits. The PUMAD model, a PU learning model, and a DML model were established on the constructed dataset and used to identify the geochemical anomalies linked to known polymetallic mineralization. A comparative analysis of the three models showed that the PUMAD model performed much better than the other two in identifying mineralization-related geochemical anomalies.
The receiver operating characteristic (ROC) curve of the PUMAD model was closer to the upper left corner of the ROC space than those of the PU learning model and the DML model. The area under the ROC curve (AUC) of the PUMAD model was 0.9626, substantially exceeding those of the PU learning model (0.8493) and the DML model (0.7542). The geochemical anomalies linked to polymetallic mineralization recognized by the PUMAD model comprised 10.89% of the Baishan exploration area and encompassed all the discovered polymetallic deposits within the area, while those recognized by the PU learning model and the DML model comprised 16.87% and 25.29% of the study area and encompassed 90% and 87% of the discovered polymetallic deposits, respectively. The recognized mineralization-related geochemical anomalies are spatially linked to the regional geological factors that controlled polymetallic mineralization in the Baishan exploration area. It can therefore be concluded that PUMAD is an effective technique for detecting mineralization-related anomalies within an exploration area. It is worthwhile to further test its validity for mapping mineralization-related geochemical anomalies in other exploration areas.
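
The abstract compares the models by AUC and by two map-level figures: the share of the study area flagged as anomalous and the share of known deposits falling inside the flagged area. The sketch below illustrates how such figures could be computed from per-cell anomaly scores. It is not the authors' implementation: the function names, the toy scores, the threshold, and the choice to treat unlabeled cells as negatives when computing AUC are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def roc_auc(scores_pos: Sequence[float], scores_unl: Sequence[float]) -> float:
    """AUC as the probability that a positive cell outscores an unlabeled cell.

    Uses the Mann-Whitney formulation; ties count as one half.
    Assumption: unlabeled cells stand in for negatives.
    """
    wins = 0.0
    for p in scores_pos:
        for u in scores_unl:
            if p > u:
                wins += 1.0
            elif p == u:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_unl))


def anomaly_summary(
    scores: Sequence[float], is_deposit: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (fraction of cells flagged, fraction of deposit cells captured)."""
    flagged = [s >= threshold for s in scores]
    area_fraction = sum(flagged) / len(scores)
    n_deposits = sum(is_deposit)
    captured = sum(1 for f, d in zip(flagged, is_deposit) if f and d)
    return area_fraction, captured / n_deposits


# Hypothetical toy data: anomaly scores for 10 grid cells;
# 1 marks a cell that contains a known deposit (a positive sample).
scores = [0.95, 0.90, 0.20, 0.10, 0.85, 0.30, 0.15, 0.05, 0.40, 0.25]
is_deposit = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

pos = [s for s, d in zip(scores, is_deposit) if d]
unl = [s for s, d in zip(scores, is_deposit) if not d]

auc = roc_auc(pos, unl)
area, capture = anomaly_summary(scores, is_deposit, threshold=0.35)
print(f"AUC={auc:.4f}  area flagged={area:.2%}  deposits captured={capture:.2%}")
```

Read together, a smaller flagged area with a higher deposit capture rate indicates a more efficient anomaly map, which is the sense in which the abstract ranks PUMAD (10.89% of the area, all deposits) ahead of the other two models.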

Source journal
Natural Resources Research (Environmental Science: General Environmental Science)
CiteScore: 11.90
Self-citation rate: 11.10%
Articles published: 151
About the journal: This journal publishes quantitative studies of natural (mainly but not limited to mineral) resources exploration, evaluation and exploitation, including environmental and risk-related aspects. Typical articles use geoscientific data or analyses to assess, test, or compare resource-related aspects. NRR covers a wide variety of resources including minerals, coal, hydrocarbon, geothermal, water, and vegetation. Case studies are welcome.