Efficient Distortion-Minimized Layerwise Pruning

Kaixin Xu; Zhe Wang; Runtao Huang; Xue Geng; Jie Lin; Xulei Yang; Min Wu; Xiaoli Li; Weisi Lin

IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 10, pp. 9298-9315, 2025. DOI: 10.1109/TPAMI.2025.3586418. https://ieeexplore.ieee.org/document/11072282/
Abstract
In this paper, we propose a post-training pruning framework that jointly optimizes layerwise pruning to minimize model output distortion. Through theoretical and empirical analysis, we discover an important additivity property of the output distortion induced by pruning weights/channels in DNNs. Leveraging this property, we reformulate pruning optimization as a combinatorial problem and solve it with dynamic programming, achieving linear time complexity and making the algorithm very fast on CPUs. Furthermore, we optimize the additivity-derived distortions using a Hessian-based Taylor approximation to enhance pruning efficiency, accompanied by fine-grained complexity-reduction techniques. Our method is evaluated on various DNN architectures, including CNNs, ViTs, and object detectors, and on vision tasks such as image classification on CIFAR-10 and ImageNet, and 3D object detection on various datasets. We achieve state-of-the-art results with significant FLOPs reductions and no accuracy loss. Specifically, on CIFAR-10, we achieve up to $27.9\times$, $29.2\times$, and $14.9\times$ FLOPs reductions on ResNet-32, VGG-16, and DenseNet-121, respectively. On ImageNet, we observe no accuracy loss with $1.69\times$ and $2\times$ FLOPs reductions on ResNet-50 and DeiT-Base, respectively. For 3D object detection, we achieve $3.89\times$ and $3.72\times$ FLOPs reductions on the CenterPoint and PVRCNN models. These results demonstrate the effectiveness and practicality of our approach for improving model performance through layer-adaptive weight pruning.
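The allocation step described in the abstract can be illustrated with a minimal sketch. Assuming the additivity property holds, choosing one pruning level per layer under a FLOPs budget becomes a multiple-choice knapsack, which dynamic programming solves in time linear in the number of layers for a fixed budget discretization. All names below (`allocate_sparsity`, the distortion/cost tables) are illustrative, not the authors' code, and the exact DP formulation in the paper may differ:

```python
# Hedged sketch: layerwise sparsity allocation by dynamic programming,
# under the assumption that per-layer output distortions add up.
# distortion[l][k]: distortion caused by pruning layer l at candidate level k
# cost[l][k]: FLOPs units retained by that choice (discretized integers)
from typing import List, Tuple

def allocate_sparsity(
    distortion: List[List[float]],
    cost: List[List[int]],
    budget: int,
) -> Tuple[float, List[int]]:
    """Pick one pruning option per layer to minimize total (additive)
    distortion subject to a FLOPs budget. O(L * K * budget) time."""
    L = len(distortion)
    INF = float("inf")
    # dp[b] = min distortion over processed layers using exactly b FLOPs units
    dp = [0.0] + [INF] * budget
    choice = [[-1] * (budget + 1) for _ in range(L)]

    for l in range(L):
        new_dp = [INF] * (budget + 1)
        for b in range(budget + 1):
            if dp[b] == INF:
                continue
            for k, (d, c) in enumerate(zip(distortion[l], cost[l])):
                nb = b + c
                if nb <= budget and dp[b] + d < new_dp[nb]:
                    new_dp[nb] = dp[b] + d
                    choice[l][nb] = k
        dp = new_dp

    best_b = min(range(budget + 1), key=lambda b: dp[b])
    if dp[best_b] == INF:
        raise ValueError("budget infeasible for the given options")
    picks = []  # backtrack the chosen option per layer
    b = best_b
    for l in range(L - 1, -1, -1):
        k = choice[l][b]
        picks.append(k)
        b -= cost[l][k]
    picks.reverse()
    return dp[best_b], picks

# Toy usage: 2 layers, 3 candidate levels each (cost 4 = unpruned).
d = [[0.0, 0.1, 0.5], [0.0, 0.2, 0.9]]
c = [[4, 2, 1], [4, 2, 1]]
print(allocate_sparsity(d, c, budget=5))  # -> (0.3, [1, 1])
```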
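For the distortion entries themselves, a common second-order estimate in the spirit of the Hessian-based Taylor approximation mentioned above scores the pruned weights by $\frac{1}{2}\sum_i H_{ii} w_i^2$ (the first-order term vanishes at a converged minimum). Whether the paper uses this exact diagonal form is an assumption, and the empirical-Fisher Hessian proxy below is likewise illustrative:

```python
import torch

def taylor_distortion(weight: torch.Tensor, hessian_diag: torch.Tensor,
                      prune_mask: torch.Tensor) -> torch.Tensor:
    """Second-order Taylor estimate of the distortion from zeroing the
    masked weights: 0.5 * sum_i H_ii * w_i^2 over pruned entries
    (diagonal-Hessian assumption, an illustrative stand-in)."""
    pruned = weight * prune_mask  # entries to be removed
    return 0.5 * (hessian_diag * pruned.pow(2)).sum()

def fisher_diag(model, loss_fn, x, y):
    """Diagonal Hessian proxy via the empirical Fisher on a calibration
    batch: squared gradients per parameter."""
    model.zero_grad()
    loss_fn(model(x), y).backward()
    return {n: p.grad.pow(2).detach()
            for n, p in model.named_parameters() if p.grad is not None}
```

Filling the `distortion` table of the previous sketch with such per-option Taylor scores, computed once on a calibration set, keeps the whole pipeline post-training, consistent with the framework the abstract describes.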