DyCE: Dynamically Configurable Exiting for deep learning compression and real-time scaling

IF 6.2 | CAS Tier 2, Computer Science | JCR Q1, COMPUTER SCIENCE, THEORY & METHODS
Qingyuan Wang, Barry Cardiff, Antoine Frappé, Benoit Larras, Deepu John
{"title":"DyCE:动态配置退出深度学习压缩和实时缩放","authors":"Qingyuan Wang ,&nbsp;Barry Cardiff ,&nbsp;Antoine Frappé ,&nbsp;Benoit Larras ,&nbsp;Deepu John","doi":"10.1016/j.future.2025.107837","DOIUrl":null,"url":null,"abstract":"<div><div>Conventional deep learning (DL) model compression methods affect all input samples equally. However, as samples vary in difficulty, a dynamic model that adapts computation based on sample complexity offers a novel perspective for compression and scaling. Despite this potential, existing dynamic techniques are typically monolithic and have model-specific implementations, limiting their generalizability as broad compression and scaling methods. Additionally, most deployed DL systems are fixed, and unable to adjust once deployed. This paper introduces DyCE, a dynamically configurable system that can adjust the performance-complexity trade-off of a DL model at runtime without needing re-initialization or re-deployment. DyCE achieves this by adding exit networks to intermediate layers, thus allowing early termination if results are acceptable. DyCE also decouples the design of exit networks from the base model itself, enabling its easy adaptation to new base models. We also propose methods for generating optimized configurations and determining exit network types and positions for dynamic trade-offs. By enabling simple configuration switching, DyCE enables fine-grained performance-complexity tuning in real-time. We demonstrate the effectiveness of DyCE through image classification tasks using deep convolutional neural networks (CNNs). DyCE significantly reduces computational complexity by 26.2% for ResNet<span><math><msub><mrow></mrow><mrow><mi>152</mi></mrow></msub></math></span>, 26.6% for ConvNextv2<span><math><msub><mrow></mrow><mrow><mi>tiny</mi></mrow></msub></math></span> and 32.0% for DaViT<span><math><msub><mrow></mrow><mrow><mi>base</mi></mrow></msub></math></span> on ImageNet validation set, with accuracy reductions of less than 0.5%.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"171 ","pages":"Article 107837"},"PeriodicalIF":6.2000,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DyCE: Dynamically Configurable Exiting for deep learning compression and real-time scaling\",\"authors\":\"Qingyuan Wang ,&nbsp;Barry Cardiff ,&nbsp;Antoine Frappé ,&nbsp;Benoit Larras ,&nbsp;Deepu John\",\"doi\":\"10.1016/j.future.2025.107837\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Conventional deep learning (DL) model compression methods affect all input samples equally. However, as samples vary in difficulty, a dynamic model that adapts computation based on sample complexity offers a novel perspective for compression and scaling. Despite this potential, existing dynamic techniques are typically monolithic and have model-specific implementations, limiting their generalizability as broad compression and scaling methods. Additionally, most deployed DL systems are fixed, and unable to adjust once deployed. This paper introduces DyCE, a dynamically configurable system that can adjust the performance-complexity trade-off of a DL model at runtime without needing re-initialization or re-deployment. DyCE achieves this by adding exit networks to intermediate layers, thus allowing early termination if results are acceptable. 
DyCE also decouples the design of exit networks from the base model itself, enabling its easy adaptation to new base models. We also propose methods for generating optimized configurations and determining exit network types and positions for dynamic trade-offs. By enabling simple configuration switching, DyCE enables fine-grained performance-complexity tuning in real-time. We demonstrate the effectiveness of DyCE through image classification tasks using deep convolutional neural networks (CNNs). DyCE significantly reduces computational complexity by 26.2% for ResNet<span><math><msub><mrow></mrow><mrow><mi>152</mi></mrow></msub></math></span>, 26.6% for ConvNextv2<span><math><msub><mrow></mrow><mrow><mi>tiny</mi></mrow></msub></math></span> and 32.0% for DaViT<span><math><msub><mrow></mrow><mrow><mi>base</mi></mrow></msub></math></span> on ImageNet validation set, with accuracy reductions of less than 0.5%.</div></div>\",\"PeriodicalId\":55132,\"journal\":{\"name\":\"Future Generation Computer Systems-The International Journal of Escience\",\"volume\":\"171 \",\"pages\":\"Article 107837\"},\"PeriodicalIF\":6.2000,\"publicationDate\":\"2025-04-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Future Generation Computer Systems-The International Journal of Escience\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167739X25001323\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X25001323","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0

Abstract

Conventional deep learning (DL) model compression methods affect all input samples equally. However, as samples vary in difficulty, a dynamic model that adapts computation based on sample complexity offers a novel perspective for compression and scaling. Despite this potential, existing dynamic techniques are typically monolithic and have model-specific implementations, limiting their generalizability as broad compression and scaling methods. Additionally, most deployed DL systems are fixed and unable to adjust once deployed. This paper introduces DyCE, a dynamically configurable system that can adjust the performance-complexity trade-off of a DL model at runtime without needing re-initialization or re-deployment. DyCE achieves this by adding exit networks to intermediate layers, thus allowing early termination if results are acceptable. DyCE also decouples the design of exit networks from the base model itself, enabling its easy adaptation to new base models. We also propose methods for generating optimized configurations and determining exit network types and positions for dynamic trade-offs. By enabling simple configuration switching, DyCE enables fine-grained performance-complexity tuning in real-time. We demonstrate the effectiveness of DyCE through image classification tasks using deep convolutional neural networks (CNNs). DyCE significantly reduces computational complexity by 26.2% for ResNet-152, 26.6% for ConvNextv2-tiny and 32.0% for DaViT-base on the ImageNet validation set, with accuracy reductions of less than 0.5%.
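
The abstract describes the mechanism only in outline, so the following is a minimal PyTorch sketch of the early-exit idea it names: lightweight exit networks attached after intermediate backbone stages, with a per-exit confidence threshold deciding whether inference terminates early. All names here (ExitHead, EarlyExitModel, set_configuration) and the softmax-confidence exit rule are illustrative assumptions, not the paper's actual API.

```python
import torch
import torch.nn as nn


class ExitHead(nn.Module):
    """A lightweight classifier attached to an intermediate feature map."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.pool(x).flatten(1))


class EarlyExitModel(nn.Module):
    """Backbone stages interleaved with exit heads (hypothetical wrapper).

    A "configuration" here is simply a list of confidence thresholds,
    one per exit; swapping the list at runtime changes the
    performance-complexity trade-off without re-initializing the model.
    """

    def __init__(self, stages: nn.ModuleList, exits: nn.ModuleList):
        super().__init__()
        self.stages = stages
        self.exits = exits
        self.thresholds = [1.1] * len(exits)  # default: never exit early

    def set_configuration(self, thresholds: list[float]) -> None:
        # Runtime switch: lower thresholds -> more early exits -> less compute.
        self.thresholds = list(thresholds)

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = None
        for stage, head, threshold in zip(self.stages, self.exits, self.thresholds):
            x = stage(x)
            logits = head(x)
            confidence = logits.softmax(dim=-1).max(dim=-1).values
            if confidence.item() >= threshold:  # single-sample inference assumed
                return logits  # accept this exit; deeper stages are skipped
        return logits  # no exit fired: fall through to the deepest head


# Toy usage with two convolutional stages (shapes are illustrative only).
stages = nn.ModuleList([
    nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU()),
])
exits = nn.ModuleList([ExitHead(16, 10), ExitHead(32, 10)])
model = EarlyExitModel(stages, exits).eval()

model.set_configuration([0.9, 0.0])  # exit 1 fires if >= 90% confident
out = model(torch.randn(1, 3, 32, 32))
```

Under this sketch, switching the performance-complexity operating point is a cheap list assignment rather than a re-deployment, which mirrors the runtime configuration switching the abstract claims; how DyCE actually generates and ranks these configurations is detailed in the paper, not here.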
Source journal

CiteScore: 19.90
Self-citation rate: 2.70%
Articles published: 376
Review time: 10.6 months
About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.