BHPVAS: visual analysis system for pruning attention heads in BERT model

IF 1.7 · CAS Zone 4 (Computer Science) · JCR Q3, Computer Science, Interdisciplinary Applications
Zhen Liu, Haibo Sun, Huawei Sun, Xinyu Hong, Gang Xu, Xiangyang Wu
{"title":"BHPVAS:用于修剪 BERT 模型中注意力头的视觉分析系统","authors":"Zhen Liu, Haibo Sun, Huawei Sun, Xinyu Hong, Gang Xu, Xiangyang Wu","doi":"10.1007/s12650-024-00985-z","DOIUrl":null,"url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>In the field of deep learning, pre-trained BERT models have achieved remarkable success. However, the accompanying problem is that models with more complex structures and more network parameters. The huge parameter size makes the computational cost in terms of time and memory become extremely expensive. Recent work has indicated that BERT models own a significant amount of redundant attention heads. Meanwhile considerable BERT models compression algorithms have been proposed, which can effectively reduce model complexity and redundancy with pruning some attention heads. Nevertheless, existing automated model compression solutions are mainly based on predetermined pruning program, which requires multiple expensive pruning-retraining cycles or heuristic designs to select additional hyperparameters. Furthermore, the training process of BERT models is a black box, and lacks interpretability, which makes researchers cannot intuitively understand the optimization process of the model. In this paper, we propose a visual analysis system, BHPVAS, for pruning BERT models, which helps researchers to incorporate their understanding of model structure and operating mechanism into the model pruning process and generate pruning schemes. We propose three pruning criteria based on the attention data, namely, importance score, stability score, and similarity score, for evaluating the importance of self-attention heads. Additionally, we design multiple collaborative views to display the entire pruning process, guiding users to carry out pruning. Our system supports exploring the role of self-attention heads in the model inference process using text dependency relations and attention weight distribution. Finally, we conduct two case studies to demonstrate how to use the system for Sentiment Classification Sample Analysis and Pruning Scheme Exploration, verifying the effectiveness of the visual analysis system.</p><h3 data-test=\"abstract-sub-heading\">Graphical Abstract</h3>","PeriodicalId":54756,"journal":{"name":"Journal of Visualization","volume":"56 1","pages":""},"PeriodicalIF":1.7000,"publicationDate":"2024-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"BHPVAS: visual analysis system for pruning attention heads in BERT model\",\"authors\":\"Zhen Liu, Haibo Sun, Huawei Sun, Xinyu Hong, Gang Xu, Xiangyang Wu\",\"doi\":\"10.1007/s12650-024-00985-z\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<h3 data-test=\\\"abstract-sub-heading\\\">Abstract</h3><p>In the field of deep learning, pre-trained BERT models have achieved remarkable success. However, the accompanying problem is that models with more complex structures and more network parameters. The huge parameter size makes the computational cost in terms of time and memory become extremely expensive. Recent work has indicated that BERT models own a significant amount of redundant attention heads. Meanwhile considerable BERT models compression algorithms have been proposed, which can effectively reduce model complexity and redundancy with pruning some attention heads. 
Nevertheless, existing automated model compression solutions are mainly based on predetermined pruning program, which requires multiple expensive pruning-retraining cycles or heuristic designs to select additional hyperparameters. Furthermore, the training process of BERT models is a black box, and lacks interpretability, which makes researchers cannot intuitively understand the optimization process of the model. In this paper, we propose a visual analysis system, BHPVAS, for pruning BERT models, which helps researchers to incorporate their understanding of model structure and operating mechanism into the model pruning process and generate pruning schemes. We propose three pruning criteria based on the attention data, namely, importance score, stability score, and similarity score, for evaluating the importance of self-attention heads. Additionally, we design multiple collaborative views to display the entire pruning process, guiding users to carry out pruning. Our system supports exploring the role of self-attention heads in the model inference process using text dependency relations and attention weight distribution. Finally, we conduct two case studies to demonstrate how to use the system for Sentiment Classification Sample Analysis and Pruning Scheme Exploration, verifying the effectiveness of the visual analysis system.</p><h3 data-test=\\\"abstract-sub-heading\\\">Graphical Abstract</h3>\",\"PeriodicalId\":54756,\"journal\":{\"name\":\"Journal of Visualization\",\"volume\":\"56 1\",\"pages\":\"\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2024-04-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Visualization\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s12650-024-00985-z\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Visualization","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s12650-024-00985-z","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0



Abstract

In the field of deep learning, pre-trained BERT models have achieved remarkable success. However, they come with increasingly complex structures and ever more network parameters, and this huge parameter size makes the computational cost in time and memory extremely expensive. Recent work has indicated that BERT models contain a significant number of redundant attention heads, and many BERT compression algorithms have been proposed that effectively reduce model complexity and redundancy by pruning some of these heads. Nevertheless, existing automated model compression solutions are mainly based on predetermined pruning procedures, which require multiple expensive pruning-retraining cycles or heuristic designs to select additional hyperparameters. Furthermore, the training process of BERT models is a black box and lacks interpretability, so researchers cannot intuitively understand the optimization process of the model. In this paper, we propose a visual analysis system, BHPVAS, for pruning BERT models, which helps researchers incorporate their understanding of model structure and operating mechanisms into the pruning process and generate pruning schemes. We propose three pruning criteria based on attention data, namely the importance score, stability score, and similarity score, for evaluating the importance of self-attention heads. Additionally, we design multiple coordinated views that display the entire pruning process and guide users in carrying out pruning. Our system supports exploring the role of self-attention heads in the model inference process using text dependency relations and attention weight distributions. Finally, we conduct two case studies, sentiment classification sample analysis and pruning scheme exploration, to demonstrate how to use the system and verify the effectiveness of the visual analysis system.
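To make the head-pruning workflow described above concrete, the following is a minimal sketch, not the authors' method: BHPVAS's actual importance, stability, and similarity scores are defined in the paper, whereas this example ranks heads by a generic attention "peakedness" proxy computed from attention weights on one illustrative sentence, and then removes the lowest-ranked heads with the Hugging Face Transformers `prune_heads` API. The 10% pruning ratio and the sample sentence are arbitrary assumptions for illustration.

```python
# Hypothetical sketch of score-based attention-head pruning (not the BHPVAS criteria).
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

inputs = tokenizer("The movie was surprisingly good.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer,
# each of shape (batch, num_heads, seq_len, seq_len).
scores = {}
for layer_idx, attn in enumerate(outputs.attentions):
    # Illustrative proxy: how "peaked" each head's attention is, measured as
    # the mean of the maximum attention weight over query positions.
    peakedness = attn.max(dim=-1).values.mean(dim=(0, 2))  # (num_heads,)
    for head_idx, s in enumerate(peakedness.tolist()):
        scores[(layer_idx, head_idx)] = s

# Mark the lowest-scoring ~10% of heads as pruning candidates.
ranked = sorted(scores, key=scores.get)
heads_to_prune = {}
for layer_idx, head_idx in ranked[: max(1, len(ranked) // 10)]:
    heads_to_prune.setdefault(layer_idx, []).append(head_idx)

model.prune_heads(heads_to_prune)  # physically removes the selected heads
print("Pruned heads:", heads_to_prune)
```

In BHPVAS the candidate set would instead be chosen interactively, with the coordinated views showing the scores, dependency relations, and attention distributions that justify each pruning decision, rather than a fixed global threshold as in this sketch.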

Graphical Abstract

Source journal

Journal of Visualization (Computer Science, Interdisciplinary Applications; Imaging Science & Photographic Technology)

CiteScore: 3.40
Self-citation rate: 5.90%
Articles per year: 79
Review time: >12 weeks

Journal description: Visualization is an interdisciplinary imaging science devoted to making the invisible visible through the techniques of experimental visualization and computer-aided visualization. The scope of the Journal is to provide a place to exchange information on the latest visualization technology and its application through the presentation of the latest papers by both researchers and technicians.