Efficient Distortion-Minimized Layerwise Pruning

Impact Factor: 18.6
Kaixin Xu;Zhe Wang;Runtao Huang;Xue Geng;Jie Lin;Xulei Yang;Min Wu;Xiaoli Li;Weisi Lin
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 10, pp. 9298-9315
DOI: 10.1109/TPAMI.2025.3586418
Published: 2025-07-07 (Journal Article)
URL: https://ieeexplore.ieee.org/document/11072282/
Citations: 0

Abstract

Efficient Distortion-Minimized Layerwise Pruning
In this paper, we propose a post-training pruning framework that jointly optimizes layerwise pruning to minimize model output distortion. Through theoretical and empirical analysis, we discover an important additivity property of output distortion from pruning weights/channels in DNNs. Leveraging this property, we reformulate pruning optimization as a combinatorial problem and solve it with dynamic programming, achieving linear time complexity and making the algorithm very fast on CPUs. Furthermore, we optimize additivity-derived distortions using Hessian-based Taylor approximation to enhance pruning efficiency, accompanied by fine-grained complexity reduction techniques. Our method is evaluated on various DNN architectures, including CNNs, ViTs, and object detectors, and on vision tasks such as image classification on CIFAR-10 and ImageNet, as well as 3D object detection on various datasets. We achieve state-of-the-art (SoTA) results with significant FLOPs reductions and no accuracy loss. Specifically, on CIFAR-10, we achieve up to $27.9\times$, $29.2\times$, and $14.9\times$ FLOPs reductions on ResNet-32, VGG-16, and DenseNet-121, respectively. On ImageNet, we observe no accuracy loss with $1.69\times$ and $2\times$ FLOPs reductions on ResNet-50 and DeiT-Base, respectively. For 3D object detection, we achieve $\mathbf {3.89}\times, \mathbf {3.72}\times$ FLOPs reductions on CenterPoint and PVRCNN models. These results demonstrate the effectiveness and practicality of our approach for improving model performance through layer-adaptive weight pruning.
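The combinatorial reformulation described in the abstract can be illustrated with a small sketch: assuming per-layer distortions are additive and the FLOPs budget is an integer, a knapsack-style dynamic program picks one pruning option per layer. All names and the toy distortion/cost tables below are illustrative, not the authors' implementation.

```python
# Hypothetical sketch (not the authors' code): knapsack-style dynamic programming
# that selects one pruning option per layer to minimize total output distortion,
# assuming distortions add across layers and FLOPs costs are integers.

def dp_allocate(distortion, cost, budget):
    """distortion[l][k]: estimated distortion if layer l uses pruning option k.
    cost[l][k]: integer FLOPs cost of that option. budget: total FLOPs budget.
    Returns (minimum total distortion, chosen option index per layer)."""
    INF = float("inf")
    best = [0.0] * (budget + 1)   # best[b]: min distortion with total cost <= b
    picks = []                    # picks[l][b]: option chosen at layer l for budget b
    for l in range(len(distortion)):
        new, pick = [INF] * (budget + 1), [-1] * (budget + 1)
        for b in range(budget + 1):
            for k, (d, c) in enumerate(zip(distortion[l], cost[l])):
                if c <= b and best[b - c] + d < new[b]:
                    new[b], pick[b] = best[b - c] + d, k
        best = new
        picks.append(pick)
    # Backtrack to recover the per-layer choices.
    b, chosen = budget, []
    for l in reversed(range(len(distortion))):
        k = picks[l][b]
        chosen.append(k)
        b -= cost[l][k]
    chosen.reverse()
    return best[budget], chosen

# Two layers, two options each: option 0 keeps more FLOPs, option 1 prunes harder.
total, chosen = dp_allocate(
    distortion=[[0.0, 0.3], [0.0, 0.5]],
    cost=[[4, 2], [4, 2]],
    budget=6,
)
# With budget 6 it is cheapest to prune layer 0 harder and keep layer 1 intact.
```

The table fill runs in O(layers × budget × options), i.e., linear in the number of layers, consistent with the linear-time claim in the abstract.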
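The Hessian-based Taylor approximation the abstract mentions can also be sketched. At a converged model the first-order (gradient) term of the Taylor expansion vanishes, leaving a second-order distortion estimate per pruned weight. The empirical-Fisher diagonal used as a Hessian proxy here is an assumption for illustration, not necessarily the paper's exact estimator.

```python
# Hypothetical sketch: second-order Taylor estimate of the distortion from
# pruning a single weight. With gradient ~ 0 at convergence and a diagonal
# Hessian, delta_i ~= 0.5 * H_ii * w_i**2, where H_ii is approximated by the
# empirical Fisher (mean of squared per-sample gradients) -- an illustrative
# proxy, not necessarily the authors' exact estimator.

def taylor_distortion(weights, grads_per_sample):
    """weights: flat list of weights; grads_per_sample: list of per-sample
    gradient lists. Returns a per-weight pruning-distortion estimate."""
    m = len(grads_per_sample)
    scores = []
    for i, w in enumerate(weights):
        h_ii = sum(g[i] ** 2 for g in grads_per_sample) / m  # Fisher proxy for H_ii
        scores.append(0.5 * h_ii * w ** 2)
    return scores

w = [1.0, -0.1, 2.0]
g = [[0.2, 0.5, 0.01], [0.4, 0.3, 0.02]]
scores = taylor_distortion(w, g)
# Weights are then pruned in order of increasing estimated distortion.
order = sorted(range(len(scores)), key=scores.__getitem__)
```

Note that a large weight with a near-zero Hessian diagonal can still be cheap to prune, which is what distinguishes this criterion from plain magnitude pruning.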