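Anomaly Detection-Oriented Positive-Unlabeled Metric Learning for Extracting High-Dimensional Geochemical Anomalies Linked to Mineralization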

IF 4.8 | Zone 2, Earth Sciences | JCR Q1, Geosciences, Multidisciplinary
Zhaorui Yang, Yongliang Chen
Natural Resources Research | Journal Article | Published 2025-02-05 | DOI: 10.1007/s11053-025-10464-3 (https://doi.org/10.1007/s11053-025-10464-3)
Citations: 0

Abstract


In geochemical exploration, a small number of positive samples and a large number of unlabeled samples can be defined from the geochemical exploration data and the mineral deposits (occurrences) found in the exploration area. The positive samples usually comprise multiple types of mineral deposits (occurrences), while the unlabeled samples usually comprise a large number of background samples and some unknown positive samples. Accurately recognizing the unknown positive samples among the many unlabeled samples is a challenge in exploration geochemistry. To address this challenge, positive-unlabeled (PU) metric learning for anomaly detection (PUMAD) is developed to model positive-unlabeled geochemical exploration data and detect mineralization-related anomalies. PUMAD is a novel PU learning algorithm that combines artificial neural networks with distance hashing-based filtering (DHF) and deep metric learning (DML) to establish an anomaly detection model for datasets with positive and unlabeled samples. To test the effectiveness and robustness of PUMAD in identifying mineralization-related geochemical anomalies, the Baishan area of Jilin Province (China) was chosen as the case study area, and a dataset with positive and unlabeled samples was constructed from the stream sediment geochemical survey data of four 1:200,000 scale geological maps and the spatial locations of more than 30 discovered polymetallic deposits. The PUMAD model, a PU learning model, and a DML model were established on the constructed dataset and used to identify the geochemical anomalies linked to known polymetallic mineralization. A comparative analysis showed that the PUMAD model performed much better than the other two models in identifying mineralization-related geochemical anomalies. The receiver operating characteristic (ROC) curve of the PUMAD model was closer to the upper left corner of the ROC space than those of the PU learning model and the DML model. The area under the ROC curve (AUC) of the PUMAD model was 0.9626, substantially exceeding those of the PU learning model (0.8493) and the DML model (0.7542). The geochemical anomalies linked to polymetallic mineralization recognized by the PUMAD model comprised 10.89% of the Baishan exploration area and encompassed all the discovered polymetallic deposits within the area, while those recognized by the PU learning model and the DML model comprised 16.87% and 25.29% of the study area and encompassed 90% and 87% of the discovered polymetallic deposits, respectively. The recognized mineralization-related geochemical anomalies are spatially linked to the regional geological factors that controlled polymetallic mineralization in the Baishan exploration area. It can therefore be concluded that PUMAD is an effective technique for detecting mineralization-related anomalies within an exploration area. It is worthwhile to further test its validity for mapping mineralization-related geochemical anomalies in other exploration areas.
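
The two evaluation criteria reported in the abstract, the AUC and the trade-off between the anomalous share of the study area and the share of known deposits it contains, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `roc_auc` and `area_and_capture`, the threshold, and the ten synthetic scores are all hypothetical, and the AUC is computed with the standard pairwise-ranking formulation rather than any procedure described in the paper.

```python
from typing import Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive sample outranks an unlabeled one.

    Ties count as half a win. In a PU setting, label 0 means "unlabeled",
    not "confirmed background", so this value is only an approximation.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one unlabeled sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def area_and_capture(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (fraction of cells flagged anomalous, fraction of known deposits flagged)."""
    flagged = [s >= threshold for s in scores]
    area_fraction = sum(flagged) / len(scores)
    deposit_flags = [f for f, y in zip(flagged, labels) if y == 1]
    if not deposit_flags:
        raise ValueError("need at least one known deposit")
    capture_rate = sum(deposit_flags) / len(deposit_flags)
    return area_fraction, capture_rate


# Hypothetical anomaly scores for 10 grid cells; 1 marks a cell with a known deposit.
scores = [0.95, 0.90, 0.80, 0.40, 0.35, 0.30, 0.20, 0.15, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

auc = roc_auc(scores, labels)
area, capture = area_and_capture(scores, labels, threshold=0.5)
print(f"AUC={auc:.4f}, anomalous area={area:.0%}, deposits captured={capture:.0%}")
```

A model is preferable on these criteria when it captures more of the known deposits while flagging a smaller share of the area, which is the comparison the abstract draws between the three models.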

Source journal
Natural Resources Research (Environmental Science: General Environmental Science)
CiteScore: 11.90
Self-citation rate: 11.10%
Articles published: 151
Journal description: This journal publishes quantitative studies of natural (mainly but not limited to mineral) resources exploration, evaluation and exploitation, including environmental and risk-related aspects. Typical articles use geoscientific data or analyses to assess, test, or compare resource-related aspects. NRR covers a wide variety of resources including minerals, coal, hydrocarbon, geothermal, water, and vegetation. Case studies are welcome.