Acceleration-aware, Retraining-free Evolutionary Pruning for Automated Fitment of Deep Learning Models on Edge Devices

Jeet Dutta, Swarnava Dey, Arijit Mukherjee, Arpan Pal
{"title":"Acceleration-aware, Retraining-free Evolutionary Pruning for Automated Fitment of Deep Learning Models on Edge Devices","authors":"Jeet Dutta, Swarnava Dey, Arijit Mukherjee, Arpan Pal","doi":"10.1145/3564121.3564133","DOIUrl":null,"url":null,"abstract":"Deep Learning architectures used in computer vision, natural language and speech processing, unsupervised clustering, etc. have become highly complex and application-specific in recent times. Despite existing automated feature engineering techniques, building such complex models still requires extensive domain knowledge or a huge infrastructure for employing techniques such as Neural Architecture Search (NAS). Further, many industrial applications need in-premises decision-making close to sensors, thus making deployment of deep learning models on edge devices a desirable and often necessary option. Instead of freshly designing application-specific Deep Learning models, the transformation of already built models can achieve faster time to market and cost reduction. In this work, we present an efficient re-training-free model compression method that searches for the best hyper-parameters to reduce the model size and latency without losing any accuracy. Moreover, our proposed method takes into account any drop in accuracy due to hardware acceleration, when a Deep Neural Network is executed on accelerator hardware.","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Second International Conference on AI-ML Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3564121.3564133","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Deep Learning architectures used in computer vision, natural language and speech processing, unsupervised clustering, etc., have become highly complex and application-specific in recent times. Despite existing automated feature engineering techniques, building such complex models still requires extensive domain knowledge or a huge infrastructure for employing techniques such as Neural Architecture Search (NAS). Further, many industrial applications need on-premises decision-making close to the sensors, making the deployment of deep learning models on edge devices a desirable and often necessary option. Instead of designing application-specific Deep Learning models from scratch, transforming already-built models can achieve faster time-to-market and reduced cost. In this work, we present an efficient retraining-free model compression method that searches for the best hyper-parameters to reduce model size and latency without losing any accuracy. Moreover, our proposed method accounts for any drop in accuracy due to hardware acceleration when a Deep Neural Network is executed on accelerator hardware.
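
The abstract only sketches the method, but its two key ingredients, a retraining-free evolutionary search over pruning hyper-parameters and a fitness measure taken on the accelerated deployment target, can be illustrated concretely. Below is a minimal Python sketch under those assumptions; the helpers prune_model and evaluate_on_target are hypothetical stand-ins (toy analytic proxies here) for structured pruning and on-device measurement, and none of the names, population sizes, or thresholds come from the paper.

```python
# Minimal, hypothetical sketch of acceleration-aware, retraining-free
# evolutionary pruning. All helpers and constants are illustrative
# assumptions, not the authors' actual implementation.
import random
from typing import List, Tuple

NUM_LAYERS = 10        # layers eligible for pruning (assumed)
POP_SIZE = 20          # candidate pruning configurations per generation
GENERATIONS = 30
MUTATION_STD = 0.05
BASELINE_ACC = 0.92    # accuracy of the unpruned model on the target (assumed)


def prune_model(ratios: List[float]) -> List[float]:
    """Stand-in for structured pruning; a real version would remove
    channels/filters per layer according to `ratios`."""
    return ratios


def evaluate_on_target(model: List[float]) -> Tuple[float, float]:
    """Stand-in for compiling the pruned model for the accelerator
    (e.g. with quantization) and measuring (accuracy, latency) on it.
    Here: a toy analytic proxy so the sketch runs end to end."""
    sparsity = sum(model) / len(model)
    acc = BASELINE_ACC - max(0.0, sparsity - 0.5) * 0.1   # toy accuracy drop
    latency = 100.0 * (1.0 - 0.6 * sparsity)              # toy latency (ms)
    return acc, latency


def fitness(ratios: List[float]) -> float:
    """Reward latency reduction; reject any accuracy loss outright.
    Accuracy is measured on the deployed (accelerated) model, so an
    acceleration-induced drop disqualifies a candidate just as a
    pruning-induced drop does; with no retraining step, there is no
    recovery phase to compensate for it."""
    acc, latency = evaluate_on_target(prune_model(ratios))
    if acc < BASELINE_ACC:
        return float("-inf")
    return -latency


def mutate(ratios: List[float]) -> List[float]:
    """Gaussian perturbation of each layer's ratio, clipped to [0, 0.9]."""
    return [min(0.9, max(0.0, r + random.gauss(0.0, MUTATION_STD)))
            for r in ratios]


population = [[random.uniform(0.0, 0.9) for _ in range(NUM_LAYERS)]
              for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    elite = population[: POP_SIZE // 4]                   # keep fittest quarter
    population = elite + [mutate(random.choice(elite))
                          for _ in range(POP_SIZE - len(elite))]

best = max(population, key=fitness)
print("best per-layer pruning ratios:", [round(r, 2) for r in best])
```

A real implementation could instead fold the accuracy constraint and the latency objective into a single weighted fitness; the sketch keeps them separate only to make the retraining-free, no-accuracy-loss constraint stated in the abstract explicit.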