Qingyuan Wang, Barry Cardiff, Antoine Frappé, Benoit Larras, Deepu John
{"title":"DyCE:动态配置退出深度学习压缩和实时缩放","authors":"Qingyuan Wang , Barry Cardiff , Antoine Frappé , Benoit Larras , Deepu John","doi":"10.1016/j.future.2025.107837","DOIUrl":null,"url":null,"abstract":"<div><div>Conventional deep learning (DL) model compression methods affect all input samples equally. However, as samples vary in difficulty, a dynamic model that adapts computation based on sample complexity offers a novel perspective for compression and scaling. Despite this potential, existing dynamic techniques are typically monolithic and have model-specific implementations, limiting their generalizability as broad compression and scaling methods. Additionally, most deployed DL systems are fixed, and unable to adjust once deployed. This paper introduces DyCE, a dynamically configurable system that can adjust the performance-complexity trade-off of a DL model at runtime without needing re-initialization or re-deployment. DyCE achieves this by adding exit networks to intermediate layers, thus allowing early termination if results are acceptable. DyCE also decouples the design of exit networks from the base model itself, enabling its easy adaptation to new base models. We also propose methods for generating optimized configurations and determining exit network types and positions for dynamic trade-offs. By enabling simple configuration switching, DyCE enables fine-grained performance-complexity tuning in real-time. We demonstrate the effectiveness of DyCE through image classification tasks using deep convolutional neural networks (CNNs). DyCE significantly reduces computational complexity by 26.2% for ResNet<span><math><msub><mrow></mrow><mrow><mi>152</mi></mrow></msub></math></span>, 26.6% for ConvNextv2<span><math><msub><mrow></mrow><mrow><mi>tiny</mi></mrow></msub></math></span> and 32.0% for DaViT<span><math><msub><mrow></mrow><mrow><mi>base</mi></mrow></msub></math></span> on ImageNet validation set, with accuracy reductions of less than 0.5%.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"171 ","pages":"Article 107837"},"PeriodicalIF":6.2000,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DyCE: Dynamically Configurable Exiting for deep learning compression and real-time scaling\",\"authors\":\"Qingyuan Wang , Barry Cardiff , Antoine Frappé , Benoit Larras , Deepu John\",\"doi\":\"10.1016/j.future.2025.107837\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Conventional deep learning (DL) model compression methods affect all input samples equally. However, as samples vary in difficulty, a dynamic model that adapts computation based on sample complexity offers a novel perspective for compression and scaling. Despite this potential, existing dynamic techniques are typically monolithic and have model-specific implementations, limiting their generalizability as broad compression and scaling methods. Additionally, most deployed DL systems are fixed, and unable to adjust once deployed. This paper introduces DyCE, a dynamically configurable system that can adjust the performance-complexity trade-off of a DL model at runtime without needing re-initialization or re-deployment. DyCE achieves this by adding exit networks to intermediate layers, thus allowing early termination if results are acceptable. 
DyCE also decouples the design of exit networks from the base model itself, enabling its easy adaptation to new base models. We also propose methods for generating optimized configurations and determining exit network types and positions for dynamic trade-offs. By enabling simple configuration switching, DyCE enables fine-grained performance-complexity tuning in real-time. We demonstrate the effectiveness of DyCE through image classification tasks using deep convolutional neural networks (CNNs). DyCE significantly reduces computational complexity by 26.2% for ResNet<span><math><msub><mrow></mrow><mrow><mi>152</mi></mrow></msub></math></span>, 26.6% for ConvNextv2<span><math><msub><mrow></mrow><mrow><mi>tiny</mi></mrow></msub></math></span> and 32.0% for DaViT<span><math><msub><mrow></mrow><mrow><mi>base</mi></mrow></msub></math></span> on ImageNet validation set, with accuracy reductions of less than 0.5%.</div></div>\",\"PeriodicalId\":55132,\"journal\":{\"name\":\"Future Generation Computer Systems-The International Journal of Escience\",\"volume\":\"171 \",\"pages\":\"Article 107837\"},\"PeriodicalIF\":6.2000,\"publicationDate\":\"2025-04-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Future Generation Computer Systems-The International Journal of Escience\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167739X25001323\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X25001323","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
DyCE: Dynamically Configurable Exiting for deep learning compression and real-time scaling
Conventional deep learning (DL) model compression methods treat all input samples equally. However, since samples vary in difficulty, a dynamic model that adapts its computation to sample complexity offers a novel perspective on compression and scaling. Despite this potential, existing dynamic techniques are typically monolithic, with model-specific implementations that limit their generalizability as broad compression and scaling methods. Additionally, most deployed DL systems are fixed and cannot be adjusted after deployment. This paper introduces DyCE, a dynamically configurable system that can adjust the performance-complexity trade-off of a DL model at runtime without re-initialization or re-deployment. DyCE achieves this by attaching exit networks to intermediate layers of the base model, allowing early termination once a result is acceptable. DyCE also decouples the design of exit networks from the base model itself, making it easy to adapt to new base models. We further propose methods for generating optimized configurations and for determining the types and positions of exit networks for dynamic trade-offs. Through simple configuration switching, DyCE supports fine-grained performance-complexity tuning in real time. We demonstrate the effectiveness of DyCE on image classification tasks using deep convolutional neural networks (CNNs). On the ImageNet validation set, DyCE reduces computational complexity by 26.2% for ResNet-152, 26.6% for ConvNextv2-tiny, and 32.0% for DaViT-base, with accuracy reductions of less than 0.5%.
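To make the early-exit mechanism described in the abstract concrete, here is a minimal PyTorch-style sketch. It is not DyCE's actual implementation: `ExitHead`, `EarlyExitModel`, `set_config`, and all threshold values are hypothetical names and numbers chosen for illustration, and the paper's methods for designing exit networks and searching for optimized configurations are not reproduced here.

```python
# Minimal sketch of confidence-based early exiting (illustrative, not DyCE's API).
import torch
import torch.nn as nn


class ExitHead(nn.Module):
    """A small classifier attached to an intermediate feature map."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.pool(x).flatten(1))


class EarlyExitModel(nn.Module):
    """Runs base-model stages in order and stops at the first exit whose
    top-1 softmax confidence clears that exit's threshold."""

    def __init__(self, stages: nn.ModuleList, exits: nn.ModuleList,
                 thresholds: list[float]):
        super().__init__()
        assert len(stages) == len(exits) == len(thresholds)
        self.stages, self.exits = stages, exits
        self.thresholds = thresholds  # one entry per exit; > 1.0 disables an exit

    def set_config(self, thresholds: list[float]) -> None:
        # Runtime trade-off switch: no re-initialization or re-deployment,
        # just a new threshold vector selecting a different operating point.
        self.thresholds = thresholds

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, int]:
        for i, (stage, exit_head) in enumerate(zip(self.stages, self.exits)):
            x = stage(x)
            logits = exit_head(x)
            confidence = logits.softmax(dim=-1).max(dim=-1).values
            # For clarity this exits only when the whole batch is confident;
            # per-sample routing is the practical deployment variant.
            if bool((confidence >= self.thresholds[i]).all()):
                return logits, i  # early termination: result is acceptable
        return logits, len(self.stages) - 1  # fell through to the final exit


if __name__ == "__main__":
    # Toy three-stage backbone; channel counts and thresholds are illustrative.
    stages = nn.ModuleList([
        nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU()),
        nn.Sequential(nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU()),
        nn.Sequential(nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU()),
    ])
    exits = nn.ModuleList([ExitHead(c, 10) for c in (16, 32, 64)])
    model = EarlyExitModel(stages, exits, thresholds=[0.9, 0.8, 0.0])
    logits, used_exit = model(torch.randn(1, 3, 64, 64))
    print(used_exit, logits.shape)
```

Under this reading, a "configuration" is just a vector of exit thresholds, so switching the performance-complexity operating point at runtime amounts to calling `set_config` with a different vector, consistent with the abstract's claim that no re-initialization or re-deployment is needed.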
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To keep pace with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a need for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amounts of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.