A Tiny Transformer-Based Anomaly Detection Framework for IoT Solutions

Luca Barbieri, Mattia Brambilla, Mario Stefanutti, Ciro Romano, Niccolò De Carlo, Manuel Roveri
IEEE Open Journal of Signal Processing, vol. 4, pp. 462–478, published 2023-11-16
DOI: 10.1109/OJSP.2023.3333756
Impact Factor 2.9, Q2 (Engineering, Electrical & Electronic)
Citations: 0

Abstract

The widespread proliferation of Internet of Things (IoT) devices has pushed for the development of novel transformer-based Anomaly Detection (AD) tools for accurate monitoring of functionalities in industrial systems. Despite their outstanding performance, transformer models often rely on large Neural Networks (NNs) that are difficult to execute on IoT devices due to their energy/computing constraints. This paper focuses on introducing tiny transformer-based AD tools to make them viable solutions for on-device AD. Starting from the state-of-the-art Anomaly Transformer (AT) model, which has been shown to provide accurate AD functionalities but is characterized by high computational and memory demands, we propose a tiny AD framework that finds an optimized configuration of the AT model and uses it to devise a compressed version compatible with resource-constrained IoT systems. A knowledge distillation tool is developed to obtain a highly compressed AT model without degrading the AD performance. The proposed framework is first analyzed on four widely adopted AD datasets and then assessed using data extracted from a real-world monitoring facility. The results show that the tiny AD tool provides a compressed AT model with a staggering 99.93% reduction in the number of trainable parameters compared to the original implementation (from 4.8 million to 3300 or 1400, depending on the input dataset), without significantly compromising AD accuracy. Moreover, the compressed model substantially outperforms a popular Recurrent Neural Network (RNN)-based AD tool with a similar number of trainable weights, as well as a conventional One-Class Support Vector Machine (OCSVM) algorithm.
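The knowledge-distillation step described in the abstract can be illustrated with a minimal sketch. The function name, the loss weighting, and the use of plain mean-squared-error terms are assumptions for illustration, not the paper's exact formulation: the idea is that the small student model is trained both to reconstruct the input and to match the reconstructions of the large pretrained teacher.

```python
import numpy as np

# Illustrative sketch (assumed formulation, not the paper's exact loss):
# in distillation for reconstruction-based anomaly detection, the student
# minimizes a weighted sum of (a) its own reconstruction error and
# (b) its distance to the teacher's reconstruction of the same input.
def distillation_loss(x, student_recon, teacher_recon, alpha=0.5):
    """Weighted sum of the student's reconstruction MSE against the input
    and its MSE against the teacher's reconstruction (soft target)."""
    task_loss = np.mean((student_recon - x) ** 2)
    distill_loss = np.mean((student_recon - teacher_recon) ** 2)
    return alpha * task_loss + (1.0 - alpha) * distill_loss

# Side note on the reported compression: the abstract's 99.93% figure is
# consistent with shrinking 4.8 million parameters down to 3300, since
# 1 - 3300 / 4_800_000 ≈ 0.9993.
```

In practice `alpha` would be tuned to trade off fidelity to the data against fidelity to the teacher; a student that tracks the teacher closely inherits its anomaly scores at a fraction of the parameter count.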
Source journal metrics: CiteScore 5.30; self-citation rate 0.00%; average review time 22 weeks.