{"title":"Geo-Hgan: Unsupervised anomaly detection in geochemical data via latent space learning","authors":"","doi":"10.1016/j.cageo.2024.105703","DOIUrl":null,"url":null,"abstract":"<div><p>Reconstructing geochemical data for anomaly detection using Generative Adversarial Networks (GANs) has become a prevalent method in identifying geochemical anomalies. However, injecting random noise into GANs can induce model instability. To mitigate this issue, we propose a novel anomaly detection model, Geo-Hgan, which integrates a dual adversarial network architecture with a Latent Space Adversarial Module (LSAM) to learn the distribution of latent variables from arbitrary data and optimize the sample reconstruction process, thereby alleviating instability during GAN training. Additionally, an encoder guided by the LSAM-pretrained GAN is employed to extract variational features, facilitating rapid and effective sample mapping into the latent space defined by LSAM. Experimental results demonstrate that under unsupervised conditions, Geo-Hgan achieves an Area Under the Curve (AUC) score of 85% across three geochemical datasets, outperforming similar models in accuracy and reconstruction capabilities. To assess its versatility and generalization ability, we extend Geo-Hgan to anomaly detection tasks in computer vision, where it achieves an average AUC score of 98.7% on the MvtecAD dataset, setting a new state-of-the-art performance in the domain. Furthermore, we propose AnomFilter, a method for setting anomaly thresholds based on the clustering hypothesis. AnomFilter identifies high-confidence anomaly samples identified by Geo-Hgan in the source domain and iteratively transfers them to the target domain. These high-confidence anomaly samples, combined with a small number of known positive samples in the target domain, enhance the accuracy of supervised geochemical anomaly detection in the target domain, which achieved an AUC score of 94%. 
The utilization of anomaly detection models for sample transfer learning offers a novel perspective for future work.</p></div>","PeriodicalId":55221,"journal":{"name":"Computers & Geosciences","volume":null,"pages":null},"PeriodicalIF":4.2000,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Geosciences","FirstCategoryId":"89","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0098300424001869","RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Citation count: 0
Abstract
Reconstructing geochemical data with Generative Adversarial Networks (GANs) has become a prevalent approach to identifying geochemical anomalies. However, injecting random noise into GANs can induce model instability. To mitigate this issue, we propose a novel anomaly detection model, Geo-Hgan, which integrates a dual adversarial network architecture with a Latent Space Adversarial Module (LSAM) to learn the distribution of latent variables from arbitrary data and optimize the sample reconstruction process, thereby alleviating instability during GAN training. Additionally, an encoder guided by the LSAM-pretrained GAN is employed to extract variational features, facilitating rapid and effective mapping of samples into the latent space defined by LSAM. Experimental results demonstrate that, under unsupervised conditions, Geo-Hgan achieves an Area Under the Curve (AUC) score of 85% across three geochemical datasets, outperforming similar models in accuracy and reconstruction capability. To assess its versatility and generalization ability, we extend Geo-Hgan to anomaly detection tasks in computer vision, where it achieves an average AUC score of 98.7% on the MVTec AD dataset, setting a new state of the art in the domain. Furthermore, we propose AnomFilter, a method for setting anomaly thresholds based on the clustering hypothesis. AnomFilter selects high-confidence anomaly samples detected by Geo-Hgan in the source domain and iteratively transfers them to the target domain. Combined with a small number of known positive samples in the target domain, these high-confidence anomaly samples improve the accuracy of supervised geochemical anomaly detection in the target domain, which achieves an AUC score of 94%. Using anomaly detection models for sample transfer learning offers a novel perspective for future work.
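The abstract does not spell out how AUC is computed or how AnomFilter derives its threshold, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes each sample already has a scalar anomaly score (for example a reconstruction error), computes AUC by pairwise ranking, and picks a threshold with a simple one-dimensional two-means split as a stand-in for a clustering-based rule. The function names `auc_score` and `two_cluster_threshold` and the toy scores are hypothetical.

```python
def auc_score(scores, labels):
    """Rank-based AUC: the fraction of (anomaly, normal) pairs in which
    the anomaly has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def two_cluster_threshold(scores, iters=100):
    """Hypothetical stand-in for a clustering-based threshold:
    1-D two-means, returning the midpoint between the two centres."""
    lo, hi = min(scores), max(scores)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        low = [s for s in scores if s <= mid]
        high = [s for s in scores if s > mid]
        if not low or not high:
            break  # all scores fall in one cluster; keep current centres
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break  # centres stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


# Toy data (made up for illustration): four normal samples, two anomalies.
scores = [0.10, 0.12, 0.15, 0.11, 0.90, 0.85]
labels = [0, 0, 0, 0, 1, 1]

auc = auc_score(scores, labels)
threshold = two_cluster_threshold(scores)
# Indices of samples scoring above the threshold: the "high-confidence" set.
selected = [i for i, s in enumerate(scores) if s > threshold]
```

In a transfer setting like the one described, the indices in `selected` would be the candidates passed on to the target domain; how the paper iterates that step is not stated here.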
About the journal:
Computers & Geosciences publishes high impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.