A comprehensive study of auto-encoders for anomaly detection: Efficiency and trade-offs

Machine Learning with Applications, Pub Date: 2024-07-09, DOI: 10.1016/j.mlwa.2024.100572

Asif Ahmed Neloy, Maxime Turgeon

Volume 17, Article 100572. Full text: https://www.sciencedirect.com/science/article/pii/S2666827024000483


Abstract

Unsupervised anomaly detection (UAD) is a diverse research area explored across many application domains. Over time, numerous anomaly detection techniques, including clustering, generative, and variational inference-based methods, have been developed to address specific drawbacks and advance the state of the art. Deep learning and generative models have recently played a significant role in identifying unique challenges and devising advanced approaches. Auto-encoders (AEs) are one such powerful technique, combining generative and probabilistic variational modeling with deep architectures. An auto-encoder aims to learn the underlying data distribution in order to generate corresponding sample data. This concept of data generation, together with the adoption of generative modeling, has led to extensive research and many variations in auto-encoder design, particularly in unsupervised representation learning. This study systematically reviews 11 auto-encoder architectures, categorized into three groups, and compares their reconstruction ability, sample generation, latent space visualization, and accuracy in classifying anomalous data on the Fashion-MNIST (FMNIST) and MNIST datasets. We also closely examine the scope for reproducibility under different training parameters: we conducted reproducibility experiments using similar model setups and hyperparameters and attempted to generate comparative results that indicate the scope for improvement of each auto-encoder. We conclude by analyzing the experimental results, which guide us in identifying the efficiency and trade-offs among auto-encoders and provide insight into their performance and applicability in unsupervised anomaly detection.
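
To make the idea of reconstruction-based anomaly detection concrete, the following is a minimal sketch and not the authors' implementation. It swaps in a linear auto-encoder fitted in closed form via SVD (equivalent to PCA) for the deep architectures the study compares, and it uses hypothetical synthetic data rather than MNIST or FMNIST. The function names, the latent dimension, and the 99th-percentile threshold are all assumptions chosen for illustration.

```python
import numpy as np

def fit_linear_autoencoder(x_train, latent_dim):
    """Fit a linear auto-encoder in closed form (PCA via SVD)."""
    mean = x_train.mean(axis=0)
    # Rows of vt are principal directions; keep the top latent_dim of them.
    _, _, vt = np.linalg.svd(x_train - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def reconstruction_error(x, mean, components):
    """Encode, decode, and return the per-sample squared reconstruction error."""
    z = (x - mean) @ components.T      # encoder: project into the latent space
    x_hat = z @ components + mean      # decoder: map back to the input space
    return np.sum((x - x_hat) ** 2, axis=1)

rng = np.random.default_rng(0)
# Hypothetical "normal" data: 16-dimensional points lying near a 2-D subspace.
basis = rng.normal(size=(2, 16))
normal_train = rng.normal(size=(500, 2)) @ basis + 0.05 * rng.normal(size=(500, 16))
normal_test = rng.normal(size=(100, 2)) @ basis + 0.05 * rng.normal(size=(100, 16))
# Hypothetical anomalies: isotropic noise that ignores the subspace.
anomalies = rng.normal(size=(100, 16))

mean, components = fit_linear_autoencoder(normal_train, latent_dim=2)
# Threshold: 99th percentile of training errors (an arbitrary illustrative choice).
threshold = np.percentile(reconstruction_error(normal_train, mean, components), 99)

normal_scores = reconstruction_error(normal_test, mean, components)
anomaly_scores = reconstruction_error(anomalies, mean, components)
flagged_normal = np.mean(normal_scores > threshold)       # fraction of normal points flagged
flagged_anomalous = np.mean(anomaly_scores > threshold)   # fraction of anomalies flagged
print(f"flagged normal: {flagged_normal:.2f}, flagged anomalous: {flagged_anomalous:.2f}")
```

The model is fitted only on data assumed to be normal, so inputs that do not resemble the training distribution tend to reconstruct poorly and receive a high score. A deep or variational auto-encoder would replace the two functions while the scoring and thresholding steps could stay the same in outline.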


Source journal

Machine Learning with Applications (Management Science and Operations Research, Artificial Intelligence, Computer Science Applications)

Self-citation rate: 0.00%

Review time: 98 days