BHPVAS: visual analysis system for pruning attention heads in BERT model

IF 1.7, Region 4 (Computer Science), Q3 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS)

Journal of Visualization Pub Date : 2024-04-12 DOI:10.1007/s12650-024-00985-z

Zhen Liu, Haibo Sun, Huawei Sun, Xinyu Hong, Gang Xu, Xiangyang Wu


Citation count: 0

Abstract

In the field of deep learning, pre-trained BERT models have achieved remarkable success. However, this success comes with increasingly complex model structures and a growing number of network parameters. The huge parameter size makes computation extremely expensive in both time and memory. Recent work has indicated that BERT models contain a significant number of redundant attention heads. Meanwhile, many BERT compression algorithms have been proposed that effectively reduce model complexity and redundancy by pruning some attention heads. Nevertheless, existing automated model compression solutions rely mainly on predetermined pruning procedures, which require multiple expensive pruning-retraining cycles or heuristic designs to select additional hyperparameters. Furthermore, the training process of BERT models is a black box that lacks interpretability, so researchers cannot intuitively understand how the model is optimized. In this paper, we propose a visual analysis system, BHPVAS, for pruning BERT models, which helps researchers incorporate their understanding of model structure and operating mechanisms into the pruning process and generate pruning schemes. We propose three pruning criteria based on attention data, namely importance score, stability score, and similarity score, for evaluating the importance of self-attention heads. Additionally, we design multiple collaborative views that display the entire pruning process and guide users in carrying out pruning. Our system supports exploring the role of self-attention heads in model inference using text dependency relations and attention weight distributions. Finally, we conduct two case studies, Sentiment Classification Sample Analysis and Pruning Scheme Exploration, to demonstrate how to use the system and to verify the effectiveness of the visual analysis system.
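The abstract names a similarity score among its pruning criteria but does not define it, so the following is only an illustrative sketch of how such a criterion could work, not the authors' method. It assumes similarity is the cosine similarity between flattened attention maps of two heads, and that pruning is emulated by zeroing a head's output. The function names `head_similarity`, `redundant_heads`, and `apply_head_mask`, and the 0.95 threshold, are hypothetical.

```python
import numpy as np


def head_similarity(attn):
    """Cosine similarity between heads' attention maps (assumed definition).

    attn has shape (heads, seq, seq); the result has shape (heads, heads).
    """
    flat = attn.reshape(attn.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.clip(norms, 1e-12, None)  # guard against all-zero maps
    return unit @ unit.T


def redundant_heads(sim, threshold=0.95):
    """Greedily mark a head for pruning if it is near-identical to an
    earlier head that is still kept. The threshold is an arbitrary choice."""
    pruned = []
    for j in range(sim.shape[0]):
        kept_earlier = [i for i in range(j) if i not in pruned]
        if any(sim[i, j] >= threshold for i in kept_earlier):
            pruned.append(j)
    return pruned


def apply_head_mask(head_outputs, pruned):
    """Zero the outputs of pruned heads; head_outputs is (heads, seq, dim)."""
    mask = np.ones(head_outputs.shape[0])
    mask[np.asarray(pruned, dtype=int)] = 0.0
    return head_outputs * mask[:, None, None]


if __name__ == "__main__":
    # Toy input: heads 0 and 2 attend uniformly, head 1 attends to itself.
    attn = np.stack([np.full((4, 4), 0.25), np.eye(4), np.full((4, 4), 0.25)])
    sim = head_similarity(attn)
    print(redundant_heads(sim))
```

In a system like the one described, a user would presumably inspect such scores visually before deciding which heads to remove, rather than applying a fixed threshold automatically.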

Graphical Abstract (image not included)


Source journal

Journal of Visualization (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY)

CiteScore: 3.40

Self-citation rate: 5.90%

Publication volume:

Review time: >12 weeks

Journal description: Visualization is an interdisciplinary imaging science devoted to making the invisible visible through the techniques of experimental visualization and computer-aided visualization. The scope of the Journal is to provide a place to exchange information on the latest visualization technology and its applications through the presentation of the latest papers by both researchers and technicians.