A Fast Design Space Exploration Framework for the Deep Learning Accelerators: Work-in-Progress

Alessio Colucci, Alberto Marchisio, Beatrice Bussolino, Vojtech Mrazek, Maurizio Martina, Guido Masera, Muhammad Shafique
{"title":"A Fast Design Space Exploration Framework for the Deep Learning Accelerators: Work-in-Progress","authors":"Alessio Colucci, Alberto Marchisio, Beatrice Bussolino, Voitech Mrazek, M. Martina, G. Masera, M. Shafique","doi":"10.1109/CODESISSS51650.2020.9244038","DOIUrl":null,"url":null,"abstract":"The Capsule Networks (CapsNets) is an advanced form of Convolutional Neural Network (CNN), capable of learning spatial relations and being invariant to transformations. CapsNets requires complex matrix operations which current accelerators are not optimized for, concerning both training and inference passes. Current state-of-the-art simulators and design space exploration (DSE) tools for DNN hardware neglect the modeling of training operations, while requiring long exploration times that slow down the complete design flow. These impediments restrict the real-world applications of CapsNets (e.g., autonomous driving and robotics) as well as the further development of DNNs in life-long learning scenarios that require training on low-power embedded devices. Towards this, we present XploreDL, a novel framework to perform fast yet high-fidelity DSE for both inference and training accelerators, supporting both CNNs and CapsNets operations. XploreDL enables a resource-efficient DSE for accelerators, focusing on power, area, and latency, highlighting Pareto-optimal solutions which can be a green-lit to expedite the design flow. XploreDL can reach the same fidelity as ARM's SCALE-sim, while providing 600x speedup and having a 50x lower memory-footprint. Preliminary results with a deep CapsNet model on MNIST for training accelerators show promising Pareto-optimal architectures with up to 0.4 TOPS/squared-mm and 800 fJ/op efficiency. With inference accelerators for AlexNet the Pareto-optimal solutions reach up to 1.8 TOPS/squared-mm and 200 fJ/op efficiency.","PeriodicalId":437802,"journal":{"name":"2020 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CODESISSS51650.2020.9244038","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Capsule Networks (CapsNets) are an advanced form of Convolutional Neural Networks (CNNs), capable of learning spatial relations and being invariant to transformations. CapsNets require complex matrix operations, in both training and inference passes, for which current accelerators are not optimized. Current state-of-the-art simulators and design space exploration (DSE) tools for DNN hardware neglect the modeling of training operations, while requiring long exploration times that slow down the complete design flow. These impediments restrict the real-world applications of CapsNets (e.g., autonomous driving and robotics) as well as the further development of DNNs in life-long learning scenarios that require training on low-power embedded devices. Towards this, we present XploreDL, a novel framework to perform fast yet high-fidelity DSE for both inference and training accelerators, supporting both CNN and CapsNet operations. XploreDL enables a resource-efficient DSE for accelerators, focusing on power, area, and latency, and highlights Pareto-optimal solutions that can be green-lit to expedite the design flow. XploreDL reaches the same fidelity as ARM's SCALE-sim while providing a 600x speedup and a 50x lower memory footprint. Preliminary results with a deep CapsNet model on MNIST for training accelerators show promising Pareto-optimal architectures with up to 0.4 TOPS/mm^2 and 800 fJ/op efficiency. With inference accelerators for AlexNet, the Pareto-optimal solutions reach up to 1.8 TOPS/mm^2 and 200 fJ/op efficiency.
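The core of the DSE loop the abstract describes is a non-domination filter over candidate accelerator configurations, scored on power, area, and latency, with derived efficiency metrics (TOPS/mm^2 and fJ/op) used to report the survivors. Below is a minimal Python sketch of such a filter; the DesignPoint fields, configuration names, and numbers are hypothetical illustrations, not XploreDL's actual API or results.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class DesignPoint:
        # One candidate accelerator configuration. All fields are
        # hypothetical illustrations, not XploreDL's data model.
        name: str
        power_w: float     # average power in watts
        area_mm2: float    # silicon area in mm^2
        latency_s: float   # seconds per inference/training step
        ops: float         # operations executed per step

        @property
        def tops_per_mm2(self) -> float:
            # Throughput density: (ops per second) / area, scaled to tera-ops.
            return (self.ops / self.latency_s) / self.area_mm2 / 1e12

        @property
        def fj_per_op(self) -> float:
            # Energy per operation: (power * time) / ops, in femtojoules.
            return (self.power_w * self.latency_s) / self.ops * 1e15

    def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
        # Keep every point that no other point dominates on all three
        # objectives (power, area, latency); lower is better on each axis.
        def dominates(a: DesignPoint, b: DesignPoint) -> bool:
            return (a.power_w <= b.power_w
                    and a.area_mm2 <= b.area_mm2
                    and a.latency_s <= b.latency_s
                    and (a.power_w, a.area_mm2, a.latency_s)
                        != (b.power_w, b.area_mm2, b.latency_s))
        return [p for p in points if not any(dominates(q, p) for q in points)]

    # Hypothetical candidate configurations for illustration only.
    candidates = [
        DesignPoint("8x8_array",   power_w=0.5, area_mm2=2.0, latency_s=1e-3, ops=2e9),
        DesignPoint("16x16_array", power_w=1.2, area_mm2=6.5, latency_s=4e-4, ops=2e9),
        DesignPoint("32x8_array",  power_w=1.3, area_mm2=7.0, latency_s=5e-4, ops=2e9),
    ]
    for p in pareto_front(candidates):
        print(f"{p.name}: {p.tops_per_mm2:.2f} TOPS/mm^2, {p.fj_per_op:.0f} fJ/op")

In this toy run the 32x8_array is pruned because the 16x16_array beats it on all three objectives; the remaining points form the Pareto front from which a designer can green-light a configuration, matching the power/area/latency trade-off the abstract emphasizes.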