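Unbiased and augmentation-free self-supervised graph contrastive learning for detecting mineralization-related geochemical anomalies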

IF 3.3 · Zone 2 (Earth Sciences) · JCR Q1, GEOCHEMISTRY & GEOPHYSICS

Journal of Geochemical Exploration Pub Date : 2025-06-23 DOI:10.1016/j.gexplo.2025.107850

Zhaorui Yang, Yongliang Chen

Journal of Geochemical Exploration, Volume 278, Article 107850. Full text: https://www.sciencedirect.com/science/article/pii/S0375674225001827

Citations: 0


Unbiased and augmentation-free self-supervised graph contrastive learning for detecting mineralization-related geochemical anomalies

Graph contrastive learning (GCL) is a self-supervised learning technique that uses node (sample) features, the neighborhood information of node pairs, and sparse label information to build a model for identifying mineralization-related geochemical anomalies. However, building a self-supervised model with the GCL algorithm requires augmenting the graph-structured data by changing the features or views between nodes, which can cause the loss of some key neighborhood information in the original graph-structured data. Unbiased and augmentation-free self-supervised graph contrastive learning (USAF-GCL) is a novel GCL technique that does not need to augment graph-structured data when building a self-supervised model. It is therefore more reliable than the GCL algorithm for detecting high-dimensional geochemical anomalies through self-supervised node representation of graph-structured data. To demonstrate the superiority of the USAF-GCL technique in detecting mineralization-related geochemical anomalies, a USAF-GCL model and a GCL model were built on the geochemical data set of the 1:200,000 stream sediment survey covering the Baishan area, Jilin Province, China. The comparison shows that the USAF-GCL model is clearly superior to the GCL model in detecting mineralization-related geochemical anomalies. The receiver operating characteristic (ROC) curve of the USAF-GCL model dominates that of the GCL model; the areas under the ROC curves (AUCs) of the USAF-GCL and GCL models are 0.9545 and 0.8601, respectively. The precision-recall (PR) curve of the USAF-GCL model also dominates that of the GCL model; the areas under the PR curves (AUPRCs) of the USAF-GCL and GCL models are 0.9312 and 0.026, respectively, and their F1 scores are 0.7360 and 0.0672, respectively. These statistical results indicate that the USAF-GCL model is much better than the GCL model at identifying mineralized anomaly samples in the geochemical exploration data set. The anomaly areas detected by the USAF-GCL model occupy only 17.1% of the exploration area yet contain 100% of the known polymetallic deposits, whereas those identified by the GCL model occupy 25% of the area and contain only 93% of the known polymetallic deposits. Geologically, the mineralization-related anomalies identified by the USAF-GCL model are strongly consistent with the polymetallic metallogenic characteristics of the exploration area. Therefore, the USAF-GCL technique is a reliable and powerful tool for detecting mineralization-related geochemical anomalies.
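The evaluation figures quoted above can be made concrete with a small sketch. It is not the authors' code: `roc_auc` and `capture_efficiency` are hypothetical helper names, the toy labels and scores are invented for illustration, and "capture efficiency" (share of known deposits captured divided by share of area flagged) is an assumed summary ratio, not a metric named in the abstract. Only the 17.1%/100% and 25%/93% figures come from the text.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def capture_efficiency(deposit_fraction: float, area_fraction: float) -> float:
    """Known deposits captured per unit of area flagged as anomalous (assumed ratio)."""
    if not 0.0 < area_fraction <= 1.0:
        raise ValueError("area_fraction must be in (0, 1]")
    return deposit_fraction / area_fraction


# Toy data (hypothetical, NOT from the paper): 1 = sample tied to a known deposit.
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)

# Area and deposit fractions as reported in the abstract.
usaf_eff = capture_efficiency(1.00, 0.171)
gcl_eff = capture_efficiency(0.93, 0.25)

print(f"toy AUC: {toy_auc:.4f}")
print(f"USAF-GCL capture efficiency: {usaf_eff:.2f}")
print(f"GCL capture efficiency: {gcl_eff:.2f}")
```

Under this assumed ratio, the USAF-GCL anomaly map captures roughly 5.85 times as many deposits as a random selection of the same area would, against about 3.72 for the GCL map.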


Source journal: Journal of Geochemical Exploration (Earth Sciences: Geochemistry & Geophysics)

CiteScore: 7.40

Self-citation rate: 7.70%

Articles published: 148

Review time: 8.1 months

About the journal: Journal of Geochemical Exploration is mostly dedicated to the publication of original studies in exploration and environmental geochemistry and related topics. Contributions of particular interest to the journal include research that applies innovative methods to: define the genesis and evolution of mineral deposits, including the transfer of elements in large-scale mineralized areas; analyze complex systems at the boundaries between bio-geochemistry, metal transport and mineral accumulation; evaluate the effects of historical mining activities on the surface environment; trace pollutant sources and define their fate and transport models in near-surface and surface environments involving solid, fluid and aerial matrices; assess and quantify natural and technogenic radioactivity in the environment; determine geochemical anomalies and set baseline reference values using compositional data analysis, multivariate statistics and geo-spatial analysis; and assess the impacts of anthropogenic contamination on ecosystems and human health at local and regional scales, to prioritize and classify risks through deterministic and stochastic approaches. Papers presenting newly developed methods in analytical geochemistry for use in the field or in the laboratory are also within the journal's scope.