{"title":"Dependability and Protection of Transformer Models Against Soft Errors on Text Embeddings","authors":"Zhen Gao;Shuang Liu;Pedro Reviriego;Shanshan Liu;Fabrizio Lombardi","doi":"10.1109/TDMR.2024.3478753","DOIUrl":null,"url":null,"abstract":"Transformers have achieved remarkable success in diverse fields such as Natural Language Processing (NLP) and computer vision (CV). For pre-trained Transformer models involving text processing, embedding representations are important parameters, incurring a large volume of memory. Soft errors on embedding vectors can lead to incorrect inputs to Transformers, and if not corrected in time, accumulated errors may produce undesirable outcomes. This paper considers the dependability of text related Transformer models to accumulated errors on embedding parameters and takes three typical models in different applications as case studies: BERT based sentence emotion classification, T5 based text summarization, and CLIP based image classification. We first evaluate the dependability of the three models by injecting bit errors on embedding parameters; only errors on a few critical bits affect model performance. Based on this finding, we first propose an efficient selective protection for embedding parameters with small values, and then through scaling, we extend the scheme for models with large embedding parameters. Extensive simulation results show that the proposed protection scheme can effectively remove the impact of soft errors on task performance. 
In particular, the complexity overhead of the proposed scheme is negligible, and the additional memory overhead as encountered in the SEC scheme is avoided.","PeriodicalId":448,"journal":{"name":"IEEE Transactions on Device and Materials Reliability","volume":"25 1","pages":"54-65"},"PeriodicalIF":2.5000,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Device and Materials Reliability","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10714418/","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0
Abstract
Transformers have achieved remarkable success in diverse fields such as Natural Language Processing (NLP) and computer vision (CV). For pre-trained Transformer models that process text, embedding representations are important parameters that incur a large memory footprint. Soft errors on embedding vectors can lead to incorrect inputs to Transformers, and if not corrected in time, accumulated errors may produce undesirable outcomes. This paper considers the dependability of text-related Transformer models against accumulated errors on embedding parameters, taking three typical models in different applications as case studies: BERT-based sentence emotion classification, T5-based text summarization, and CLIP-based image classification. We first evaluate the dependability of the three models by injecting bit errors into embedding parameters; only errors on a few critical bits affect model performance. Based on this finding, we propose an efficient selective protection scheme for embedding parameters with small values, and then extend it through scaling to models with large embedding parameters. Extensive simulation results show that the proposed protection scheme can effectively remove the impact of soft errors on task performance. In particular, the complexity overhead of the proposed scheme is negligible, and the additional memory overhead incurred by the Single Error Correction (SEC) scheme is avoided.
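The "few critical bits" finding in the abstract can be illustrated with a short sketch. The snippet below flips individual bits of IEEE-754 float32 values, showing why high exponent bits are critical for small embedding parameters while mantissa bits are largely harmless, and adds a simplified magnitude-threshold filter loosely inspired by the selective-protection idea. The function names and the threshold value are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = mantissa LSB, 31 = sign) of a float32 value."""
    a = np.array([value], dtype=np.float32)
    a.view(np.uint32)[0] ^= np.uint32(1) << np.uint32(bit)
    return float(a[0])

def filter_corrupted(params: np.ndarray, threshold: float = 1.0) -> np.ndarray:
    """Zero out parameters whose magnitude is implausible for embeddings.

    A crude stand-in for selective protection: valid embedding values are
    small, so a value blown up by an exponent-bit flip is easy to detect.
    (The threshold of 1.0 is an illustrative assumption.)
    """
    out = params.copy()
    out[np.abs(out) > threshold] = 0.0
    return out

# A mantissa-bit error barely moves a small embedding value, while a
# high exponent-bit error changes it by dozens of orders of magnitude.
v = 0.05
print(flip_bit(v, 2))    # still close to 0.05, harmless
print(flip_bit(v, 30))   # enormous magnitude, harmful

emb = np.array([0.05, -0.12, flip_bit(0.08, 30)], dtype=np.float32)
print(filter_corrupted(emb))  # corrupted entry reset to 0.0
```

Resetting a corrupted value to zero is itself a design choice: because embeddings are dense and small-valued, zeroing one coordinate perturbs the input far less than leaving a value inflated by a factor of 2^100 or more.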
Journal Introduction:
The scope of the publication includes, but is not limited to, the reliability of: devices, materials, processes, interfaces, integrated microsystems (including MEMS and sensors), transistors, technologies (CMOS, BiCMOS, etc.), integrated circuits (IC, SSI, MSI, LSI, ULSI, ELSI, etc.), and thin-film transistor applications.

The measurement and understanding of the reliability of such entities at each phase, from the concept stage through research and development and into manufacturing scale-up, provide the overall database on the reliability of the devices, materials, processes, packages, and other necessities for the successful introduction of a product to market. This reliability database is the foundation for a quality product that meets customer expectations. A product so developed has high reliability. High quality is achieved because product weaknesses have been found (through root-cause analysis) and designed out of the final product. This process of ever-increasing reliability and quality results in a superior product. In the end, reliability and quality are not one thing but, in a sense, everything that can or must be done to guarantee that the product performs successfully in the field under customer conditions. Our goal is to capture these advances.

An additional objective is to foster cross-fertilized communication on the state of the art in the reliability of electronic materials and devices, and to provide fundamental understanding of the basic phenomena that affect reliability. The publication is also a forum for interdisciplinary studies on reliability. The overall goal is to provide leading-edge, state-of-the-art information that is critically relevant to the creation of reliable products.