Coincident learning for unsupervised anomaly detection of scientific instruments

IF 4.6 · Zone 2, Physics and Astrophysics · Q1, Computer Science, Artificial Intelligence

Machine Learning Science and Technology Pub Date : 2024-08-04 DOI:10.1088/2632-2153/ad64a6

Ryan Humble, Zhe Zhang, Finn O’Shea, Eric Darve and Daniel Ratner


Citations: 0

Abstract

Anomaly detection is an important task for complex scientific experiments and other complex systems (e.g. industrial facilities, manufacturing), where failures in a sub-system can lead to lost data, poor performance, or even damage to components. While scientific facilities generate a wealth of data, labeled anomalies may be rare (or even nonexistent) and expensive to acquire. Unsupervised approaches are therefore common and typically search for anomalies either by distance or by density of examples in the input feature space (or some associated low-dimensional representation). This paper presents a novel approach called coincident learning for anomaly detection (CoAD), which is specifically designed for multi-modal tasks and identifies anomalies based on coincident behavior across two different slices of the feature space. We define an unsupervised metric, F̂β, by analogy to the supervised classification Fβ statistic. CoAD uses F̂β to train an anomaly detection algorithm on unlabeled data, based on the expectation that anomalous behavior in one feature slice is coincident with anomalous behavior in the other. The method is illustrated using a synthetic outlier data set and an MNIST-based image data set, and is compared to the prior state of the art on two real-world tasks: a metal milling data set and our motivating task of identifying RF station anomalies in a particle accelerator.
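For readers unfamiliar with the statistic the abstract refers to, the following is a minimal sketch of the standard supervised Fβ score, together with a toy illustration of the coincidence idea on two feature slices. It is not the paper's unsupervised metric or training procedure, whose definition is not given in the abstract. The helper `coincidence_rate`, the synthetic data, and the threshold of 3.0 are assumptions introduced purely for illustration.

```python
import numpy as np


def f_beta(y_true, y_pred, beta=1.0):
    """Standard supervised F-beta score for binary labels and predictions."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)    # true positives
    fp = np.sum(~y_true & y_pred)   # false positives
    fn = np.sum(y_true & ~y_pred)   # false negatives
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return float((1 + beta**2) * tp / denom) if denom > 0 else 0.0


def coincidence_rate(flags_a, flags_b):
    """Hypothetical helper (not from the paper): of the events flagged in
    either slice, the fraction that are flagged in both slices."""
    a = np.asarray(flags_a, dtype=bool)
    b = np.asarray(flags_b, dtype=bool)
    either = np.sum(a | b)
    return float(np.sum(a & b) / either) if either > 0 else 0.0


# Toy data (assumed): rare anomalies shift both feature slices at once.
rng = np.random.default_rng(0)
n = 1000
is_anomaly = rng.random(n) < 0.02
slice_a = rng.normal(size=n) + 6.0 * is_anomaly
slice_b = rng.normal(size=n) + 6.0 * is_anomaly

# One simple threshold detector per slice (the threshold is an assumption).
flags_a = slice_a > 3.0
flags_b = slice_b > 3.0

# No labels are needed to measure how often the two detectors agree.
print("coincidence rate:", coincidence_rate(flags_a, flags_b))
# Labels are used here only to check the toy result against ground truth.
print("supervised F1 of coincident flags:", f_beta(is_anomaly, flags_a & flags_b))
```

The point of the toy example is only that agreement between detectors on two slices can be measured without labels, whereas Fβ itself requires them.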


Source journal

Machine Learning Science and Technology (Computer Science: Artificial Intelligence)

CiteScore: 9.10

Self-citation rate: 4.40%

Articles published:

Review time: 5 weeks

Journal description: Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: advance the state of machine learning-driven applications in the sciences or make conceptual, methodological or theoretical advances in machine learning with applications to, inspiration from, or motivated by scientific problems.