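Integrating self-attention mechanisms in deep learning: A novel dual-head ensemble transformer with its application to bearing fault diagnosis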

IF 3.4 · Zone 2 (Engineering and Technology) · Q2 · ENGINEERING, ELECTRICAL & ELECTRONIC

Signal Processing · Pub Date: 2024-08-30 · DOI: 10.1016/j.sigpro.2024.109683

Qing Snyder, Qingtang Jiang, Erin Tripp

Signal Processing, Volume 227, Article 109683 (Journal Article). https://www.sciencedirect.com/science/article/pii/S0165168424003037

Citations: 0

Abstract


Integrating self-attention mechanisms in deep learning: A novel dual-head ensemble transformer with its application to bearing fault diagnosis

In this paper, we propose a novel dual-head ensemble Transformer (DHET) algorithm for classifying signals with time–frequency features, such as bearing vibration signals. The DHET model employs a dual-input time–frequency architecture that integrates a 1D Transformer model and a 2D Vision Transformer model to capture spatial and time–frequency features. By using data from both the time domain and the time–frequency domain, the proposed algorithm broadens its feature extraction capabilities and improves the model’s capacity for generalization. Within the DHET structure, the original Transformer model uses self-attention to relate segments of the input signal to one another, which makes it effective at capturing long-range dependencies in signal data. The Vision Transformer model takes 2D images as input and splits them into patches; each patch is linearly embedded into a flat vector and treated as a ‘token’, and the tokens are then processed by the Transformer layers to learn global contextual representations, enabling the model to perform the signal classification task. This integration notably enhances the performance and capability of the model, and DHET is especially effective for rolling bearing fault diagnosis. The simulation results show that the proposed DHET achieves higher classification accuracy for bearing fault diagnosis than CNN-based methods.
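
The two input paths described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not state how the time–frequency image is produced, how the two branches are combined, or any hyperparameters, so the Hann-window STFT, the 12 kHz sampling rate, the segment and patch sizes, the random (untrained) projection matrices, and the mean-pool-and-concatenate fusion are all assumptions chosen only to make the token shapes concrete.

```python
import numpy as np

def segment_signal(signal, seg_len):
    """Split a 1D signal into non-overlapping segments; each segment is one token."""
    n_seg = len(signal) // seg_len
    return signal[: n_seg * seg_len].reshape(n_seg, seg_len)

def spectrogram(signal, win_len, hop):
    """Magnitude STFT with a Hann window (assumed transform); shape (freq_bins, frames)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def patchify(image, patch):
    """Cut a 2D image into non-overlapping patch x patch tiles and flatten each tile."""
    h, w = image.shape
    h, w = h - h % patch, w - w % patch  # drop rows/columns that do not fill a tile
    tiles = image[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return tiles.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

def self_attention(tokens, rng, d_model=16):
    """Single-head scaled dot-product self-attention with random, untrained weights."""
    d_in = tokens.shape[1]
    # Linear embedding of each flat token into d_model dimensions.
    x = tokens @ (rng.standard_normal((d_in, d_model)) / np.sqrt(d_in))
    wq, wk, wv = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                  for _ in range(3))
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(d_model)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
t = np.arange(2048) / 12000.0  # hypothetical 12 kHz sampling rate
signal = np.sin(2 * np.pi * 160 * t) + 0.1 * rng.standard_normal(t.size)

# 1D branch: signal segments as tokens.
time_tokens = segment_signal(signal, seg_len=64)
time_out, time_w = self_attention(time_tokens, rng)

# 2D branch: time-frequency image -> patches -> tokens.
tf_image = spectrogram(signal, win_len=128, hop=32)
tf_tokens = patchify(tf_image, patch=8)
tf_out, tf_w = self_attention(tf_tokens, rng)

# Assumed fusion: mean-pool each branch and concatenate into one feature vector.
fused = np.concatenate([time_out.mean(axis=0), tf_out.mean(axis=0)])
print(time_tokens.shape, tf_tokens.shape, fused.shape)
```

Every row of an attention weight matrix spans all tokens in its branch, which is the sense in which self-attention relates distant signal segments or distant image patches to one another. In a trained model the projections would be learned and a classification layer would follow the fused features.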


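Source journal

Signal Processing (Engineering and Technology: Engineering, Electrical and Electronic)

CiteScore: 9.20

Self-citation rate: 9.10%

Articles published: 309

Review time: 41 days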

About the journal: Signal Processing incorporates all aspects of the theory and practice of signal processing. It features original research work, tutorial and review articles, and accounts of practical developments. It is intended for the rapid dissemination of knowledge and experience to engineers and scientists working in the research, development or practical application of signal processing. Subject areas covered by the journal include: Signal Theory; Stochastic Processes; Detection and Estimation; Spectral Analysis; Filtering; Signal Processing Systems; Software Developments; Image Processing; Pattern Recognition; Optical Signal Processing; Digital Signal Processing; Multi-dimensional Signal Processing; Communication Signal Processing; Biomedical Signal Processing; Geophysical and Astrophysical Signal Processing; Earth Resources Signal Processing; Acoustic and Vibration Signal Processing; Data Processing; Remote Sensing; Signal Processing Technology; Radar Signal Processing; Sonar Signal Processing; Industrial Applications; New Applications.