Dan Liu, Shisheng Zhong, Lin Lin, Minghang Zhao, Xuyun Fu, Xueyun Liu
Title: Deep attention SMOTE: Data augmentation with a learnable interpolation factor for imbalanced anomaly detection of gas turbines
Journal: Computers in Industry (impact factor 8.2; JCR Q1, Computer Science, Interdisciplinary Applications)
DOI: 10.1016/j.compind.2023.103972
Publication date: 2023-10-01
URL: https://www.sciencedirect.com/science/article/pii/S0166361523001227
Citations: 5
Abstract
Anomaly detection of gas turbines faces two significant challenges: data imbalance and inter-class overlap. In this paper, we develop a novel data augmentation method, the deep attention synthetic minority over-sampling technique with an Encoder-Decoder (DA-SMOTE-ED), which serves as a key step in our hybrid re-sampling scheme. To reduce the risk of generating noisy data, DA-SMOTE-ED first uses an Encoder-Decoder to learn a class-separable feature space that weakens the effect of inter-class overlap. Second, an attention module assigns interpolation factors so that synthetic samples stay away from the region where normal samples aggregate. The synthetic samples are generated in the learned feature space, mapped back to the original space, and merged with under-sampled samples to form the balanced dataset. Finally, the superiority of the developed method is validated through two case studies: real monitoring data of gas turbines and a modified version of the commercial modular aero-propulsion system simulation (C-MAPSS) dataset. Specifically, its average balanced accuracy is 91.77% on the gas turbine dataset, an improvement of 3.67%, 6.4%, and 5.56% over SMOTE-ENN, TimeGAN, and AugmentTS, respectively.
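The interpolation step and the reported metric can be illustrated with a minimal sketch. It shows classic SMOTE-style interpolation between a minority sample and its nearest minority neighbor, plus the usual definition of balanced accuracy for a binary problem. It is not the authors' DA-SMOTE-ED: the interpolation factor `lam` is drawn uniformly at random here, whereas the abstract describes a factor assigned by an attention module in a feature space learned by an Encoder-Decoder. All function names are hypothetical.

```python
import random


def smote_interpolate(x, neighbor, lam):
    """Return a synthetic point on the segment from x to neighbor.

    lam is the interpolation factor in [0, 1]. Classic SMOTE samples it
    uniformly; the paper instead learns it (not reproduced here).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [xi + lam * (ni - xi) for xi, ni in zip(x, neighbor)]


def oversample(minority, n_new, rng):
    """Generate n_new synthetic minority samples (assumption: 1 nearest neighbor)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Nearest other minority sample by squared Euclidean distance.
        j = min(
            (k for k in range(len(minority)) if k != i),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(x, minority[k])),
        )
        lam = rng.random()  # random factor; a stand-in for the learned one
        synthetic.append(smote_interpolate(x, minority[j], lam))
    return synthetic


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of the recall on the anomalous class and on the normal class."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


if __name__ == "__main__":
    rng = random.Random(0)
    anomalies = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]  # toy data, not from the paper
    print(oversample(anomalies, 4, rng))
    print(balanced_accuracy(tp=8, fn=2, tn=90, fp=10))
```

Because every synthetic point lies on a segment between two real minority samples, a poorly chosen factor can place it inside a region dominated by normal samples; this is the noise risk that the learned interpolation factor is intended to reduce.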
About the journal:
The objective of Computers in Industry is to present original, high-quality, application-oriented research papers that:
• Illuminate emerging trends and possibilities in the utilization of Information and Communication Technology in industry;
• Establish connections or integrations across various technology domains within the expansive realm of computer applications for industry;
• Foster connections or integrations across diverse application areas of ICT in industry.