Unbiased and augmentation-free self-supervised graph contrastive learning for detecting mineralization-related geochemical anomalies
Zhaorui Yang, Yongliang Chen
Journal of Geochemical Exploration, volume 278, Article 107850. Published 2025-06-23. DOI: 10.1016/j.gexplo.2025.107850
https://www.sciencedirect.com/science/article/pii/S0375674225001827
Graph contrastive learning (GCL) is a self-supervised learning technique that uses node (sample) features, neighborhood information of node pairs, and sparse label information to build a model for identifying mineralization-related geochemical anomalies. However, building a self-supervised model with the GCL algorithm requires augmenting the graph-structured data by altering the features of nodes or the views between them, which can discard key neighborhood information present in the original graph. Unbiased and augmentation-free self-supervised graph contrastive learning (USAF-GCL) is a novel GCL technique that builds a self-supervised model without augmenting the graph-structured data. It is therefore more reliable than the GCL algorithm in detecting high-dimensional geochemical anomalies through self-supervised node representation of graph-structured data. To demonstrate this advantage, a USAF-GCL model and a GCL model were built on the geochemical data set of the 1:200,000 stream sediment survey covering the Baishan area, Jilin Province, China. The comparison shows that the USAF-GCL model clearly outperforms the GCL model in detecting mineralization-related geochemical anomalies. The receiver operating characteristic (ROC) curve of the USAF-GCL model dominates that of the GCL model; the areas under the ROC curves (AUCs) of the USAF-GCL and GCL models are 0.9545 and 0.8601, respectively. The precision-recall (PR) curve of the USAF-GCL model likewise dominates that of the GCL model; the areas under the PR curves (AUPRCs) are 0.9312 and 0.026, respectively, and the F1 scores are 0.7360 and 0.0672, respectively. These statistics indicate that the USAF-GCL model is much better than the GCL model at identifying mineralized anomaly samples in the geochemical exploration data set. The anomaly areas detected by the USAF-GCL model occupy only 17.1% of the exploration area yet contain 100% of the known polymetallic deposits, whereas those identified by the GCL model occupy 25% of the area and contain only 93% of the known polymetallic deposits. Geologically, the mineralization-related anomalies identified by the USAF-GCL model are strongly consistent with the polymetallic metallogenic characteristics of the exploration area. The USAF-GCL technique is therefore a reliable and powerful tool for detecting mineralization-related geochemical anomalies.
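The evaluation quantities quoted in the abstract (AUC, AUPRC, F1, and the share of the area flagged versus the share of known deposits captured) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy `scores` and `labels`, the threshold of 0.6, and the function names are hypothetical, and average precision is used as one common estimator of the area under the PR curve. The abstract does not state how the authors computed these values.

```python
from typing import Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Chance that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Mean of the precision values reached at each positive, in descending score order."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def f1_at(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """F1 score when every sample with score >= threshold is flagged as anomalous."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def area_and_capture(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Fraction of all samples flagged, and fraction of known positives among the flagged."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    return len(flagged) / len(scores), sum(flagged) / sum(labels)


# Hypothetical anomaly scores for eight samples; 1 marks a sample near a known deposit.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

print(f"AUC   = {roc_auc(scores, labels):.4f}")
print(f"AUPRC = {average_precision(scores, labels):.4f}")
print(f"F1    = {f1_at(scores, labels, 0.6):.4f}")
print("area fraction, deposits captured =", area_and_capture(scores, labels, 0.6))
```

When positives are rare, as known deposits are among stream sediment samples, AUPRC and F1 respond much more strongly to false positives than AUC does. That is one plausible reading of why the two models differ modestly in AUC but sharply in AUPRC and F1.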
About the journal:
Journal of Geochemical Exploration is mainly dedicated to the publication of original studies in exploration and environmental geochemistry and related topics.
Contributions of particular interest to the journal include research that applies innovative methods to:
define the genesis and the evolution of mineral deposits including transfer of elements in large-scale mineralized areas.
analyze complex systems at the boundaries between bio-geochemistry, metal transport and mineral accumulation.
evaluate effects of historical mining activities on the surface environment.
trace pollutant sources and define their fate and transport models in the near-surface and surface environments involving solid, fluid and aerial matrices.
assess and quantify natural and technogenic radioactivity in the environment.
determine geochemical anomalies and set baseline reference values using compositional data analysis, multivariate statistics and geo-spatial analysis.
assess the impacts of anthropogenic contamination on ecosystems and human health at local and regional scale to prioritize and classify risks through deterministic and stochastic approaches.
Papers presenting newly developed methods in analytical geochemistry, for application in the field or in the laboratory, are also within the journal's topics of interest.