Multi-Complexity-Loss DNAS for Energy-Efficient and Memory-Constrained Deep Neural Networks

Matteo Risso, A. Burrello, L. Benini, E. Macii, M. Poncino, D. J. Pagliari
{"title":"Multi-Complexity-Loss DNAS for Energy-Efficient and Memory-Constrained Deep Neural Networks","authors":"Matteo Risso, A. Burrello, L. Benini, E. Macii, M. Poncino, D. J. Pagliari","doi":"10.1145/3531437.3539720","DOIUrl":null,"url":null,"abstract":"Neural Architecture Search (NAS) is increasingly popular to automatically explore the accuracy versus computational complexity trade-off of Deep Learning (DL) architectures. When targeting tiny edge devices, the main challenge for DL deployment is matching the tight memory constraints, hence most NAS algorithms consider model size as the complexity metric. Other methods reduce the energy or latency of DL models by trading off accuracy and number of inference operations. Energy and memory are rarely considered simultaneously, in particular by low-search-cost Differentiable NAS (DNAS) solutions. We overcome this limitation proposing the first DNAS that directly addresses the most realistic scenario from a designer’s perspective: the co-optimization of accuracy and energy (or latency) under a memory constraint, determined by the target HW. We do so by combining two complexity-dependent loss functions during training, with independent strength. Testing on three edge-relevant tasks from the MLPerf Tiny benchmark suite, we obtain rich Pareto sets of architectures in the energy vs. accuracy space, with memory footprints constraints spanning from 75% to 6.25% of the baseline networks. When deployed on a commercial edge device, the STM NUCLEO-H743ZI2, our networks span a range of 2.18x in energy consumption and 4.04% in accuracy for the same memory constraint, and reduce energy by up to 2.2 × with negligible accuracy drop with respect to the baseline.","PeriodicalId":116486,"journal":{"name":"Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design","volume":"107 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3531437.3539720","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Neural Architecture Search (NAS) is increasingly popular as a way to automatically explore the accuracy versus computational complexity trade-off of Deep Learning (DL) architectures. When targeting tiny edge devices, the main challenge for DL deployment is meeting tight memory constraints, hence most NAS algorithms consider model size as the complexity metric. Other methods reduce the energy or latency of DL models by trading off accuracy against the number of inference operations. Energy and memory are rarely considered simultaneously, in particular by low-search-cost Differentiable NAS (DNAS) solutions. We overcome this limitation by proposing the first DNAS that directly addresses the most realistic scenario from a designer's perspective: the co-optimization of accuracy and energy (or latency) under a memory constraint determined by the target hardware (HW). We do so by combining two complexity-dependent loss functions during training, each with an independent strength. Testing on three edge-relevant tasks from the MLPerf Tiny benchmark suite, we obtain rich Pareto sets of architectures in the energy vs. accuracy space, with memory footprint constraints spanning from 75% to 6.25% of the baseline networks. When deployed on a commercial edge device, the STM NUCLEO-H743ZI2, our networks span a range of 2.18x in energy consumption and 4.04% in accuracy for the same memory constraint, and reduce energy by up to 2.2x with negligible accuracy drop with respect to the baseline.
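The abstract describes combining two complexity-dependent loss terms with independent strengths: one trading accuracy for energy (or latency), and one enforcing a memory constraint set by the target hardware. Below is a minimal PyTorch-style sketch of how such a combined training loss could be assembled; the estimator functions (estimate_energy, estimate_memory), the hyperparameter names, and the default values are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a two-regularizer DNAS loss, assuming differentiable
# estimates of energy and memory as functions of the architecture
# parameters `theta`. All names and defaults are hypothetical.
import torch
import torch.nn.functional as F

def multi_complexity_loss(logits, targets, theta,
                          estimate_energy, estimate_memory,
                          lambda_energy=1e-4, mu_memory=1e-3,
                          memory_budget=64 * 1024):
    """Task loss + energy regularizer + memory-constraint penalty."""
    # Standard task loss (cross-entropy for a classification task).
    task_loss = F.cross_entropy(logits, targets)

    # Energy term: minimized with its own strength lambda_energy,
    # trading accuracy for energy (or latency).
    energy_loss = lambda_energy * estimate_energy(theta)

    # Memory term: penalized only when the estimated footprint exceeds
    # the target hardware budget, so it acts as a soft constraint
    # rather than a trade-off objective.
    mem_excess = torch.relu(estimate_memory(theta) - memory_budget)
    memory_loss = mu_memory * mem_excess

    return task_loss + energy_loss + memory_loss
```

Under these assumptions, sweeping lambda_energy while holding memory_budget fixed would trace an energy vs. accuracy Pareto front for each memory target, matching the kind of Pareto sets the abstract reports for footprints between 75% and 6.25% of the baselines.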