Hardware-Aware Machine Learning: Modeling and Optimization

Diana Marculescu, Dimitrios Stamoulis, E. Cai
{"title":"Hardware-Aware Machine Learning: Modeling and Optimization","authors":"Diana Marculescu, Dimitrios Stamoulis, E. Cai","doi":"10.1145/3240765.3243479","DOIUrl":null,"url":null,"abstract":"Recent breakthroughs in Machine Learning (ML) applications, and especially in Deep Learning (DL), have made DL models a key component in almost every modern computing system. The increased popularity of DL applications deployed on a wide-spectrum of platforms (from mobile devices to datacenters) have resulted in a plethora of design challenges related to the constraints introduced by the hardware itself. “What is the latency or energy cost for an inference made by a Deep Neural Network (DNN)?” “Is it possible to predict this latency or energy consumption before a model is even trained?” “If yes, how can machine learners take advantage of these models to design the hardware-optimal DNN for deployment?” From lengthening battery life of mobile devices to reducing the runtime requirements of DL models executing in the cloud, the answers to these questions have drawn significant attention. One cannot optimize what isn't properly modeled. Therefore, it is important to understand the hardware efficiency of DL models during serving for making an inference, before even training the model. This key observation has motivated the use of predictive models to capture the hardware performance or energy efficiency of ML applications. Furthermore, ML practitioners are currently challenged with the task of designing the DNN model, i.e., of tuning the hyper-parameters of the DNN architecture, while optimizing for both accuracy of the DL model and its hardware efficiency. Therefore, state-of-the-art methodologies have proposed hardware-aware hyper-parameter optimization techniques. In this paper, we provide a comprehensive assessment of state-of-the-art work and selected results on the hardware-aware modeling and optimization for ML applications. We also highlight several open questions that are poised to give rise to novel hardware-aware designs in the next few years, as DL applications continue to significantly impact associated hardware systems and platforms.","PeriodicalId":413037,"journal":{"name":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"34","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3240765.3243479","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 34

Abstract

Recent breakthroughs in Machine Learning (ML) applications, and especially in Deep Learning (DL), have made DL models a key component in almost every modern computing system. The increased popularity of DL applications deployed on a wide spectrum of platforms (from mobile devices to datacenters) has resulted in a plethora of design challenges related to the constraints introduced by the hardware itself. “What is the latency or energy cost for an inference made by a Deep Neural Network (DNN)?” “Is it possible to predict this latency or energy consumption before a model is even trained?” “If yes, how can machine learners take advantage of these models to design the hardware-optimal DNN for deployment?” From lengthening the battery life of mobile devices to reducing the runtime requirements of DL models executing in the cloud, the answers to these questions have drawn significant attention. One cannot optimize what is not properly modeled. Therefore, it is important to understand the hardware efficiency of DL models during inference serving, before the model is even trained. This key observation has motivated the use of predictive models to capture the hardware performance or energy efficiency of ML applications. Furthermore, ML practitioners are currently challenged with the task of designing the DNN model, i.e., of tuning the hyper-parameters of the DNN architecture, while optimizing for both the accuracy of the DL model and its hardware efficiency. Therefore, state-of-the-art methodologies have proposed hardware-aware hyper-parameter optimization techniques. In this paper, we provide a comprehensive assessment of state-of-the-art work and selected results on hardware-aware modeling and optimization for ML applications. We also highlight several open questions that are poised to give rise to novel hardware-aware designs in the next few years, as DL applications continue to significantly impact associated hardware systems and platforms.
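The abstract points to two recurring ingredients of this line of work: predictive models that estimate a DNN's latency or energy on a target platform, and hyper-parameter optimization that treats those estimates as constraints or objectives. The Python sketch below is a minimal, hypothetical illustration of that pipeline and is not code from the paper: it fits a toy linear latency predictor from made-up per-layer profiling numbers, then runs a random hyper-parameter search that discards candidate networks whose predicted latency exceeds a budget. The feature choices, numbers, and names (predict_network_latency, proxy_accuracy, LATENCY_BUDGET_MS) are illustrative assumptions.

import numpy as np

# --- Predictive latency model (toy, hypothetical profiling data) ------------
# Fit a linear model mapping per-layer features (FLOPs, parameter count) to
# measured per-layer latency. The numbers below are made up for illustration.
layer_features = np.array([   # [MFLOPs, Mparams] per profiled layer
    [10.0, 0.1],
    [50.0, 0.5],
    [120.0, 1.2],
    [300.0, 2.5],
])
measured_latency_ms = np.array([0.8, 2.1, 4.9, 11.5])

# Least-squares fit: latency ~ w0 * MFLOPs + w1 * Mparams + bias
X = np.hstack([layer_features, np.ones((len(layer_features), 1))])
weights, *_ = np.linalg.lstsq(X, measured_latency_ms, rcond=None)

def predict_network_latency(layers):
    """Predicted total latency (ms): sum of per-layer linear predictions."""
    feats = np.hstack([np.array(layers, dtype=float),
                       np.ones((len(layers), 1))])
    return float((feats @ weights).sum())

# --- Hardware-aware hyper-parameter search (random-search sketch) -----------
# Reject sampled configurations whose *predicted* latency exceeds a budget,
# then keep the highest-scoring feasible one. The "accuracy" below is a
# stand-in; in practice it would come from training/validating each candidate.
LATENCY_BUDGET_MS = 15.0
rng = np.random.default_rng(0)

def candidate_layers(widths):
    # Crude assumption: FLOPs and params grow linearly with channel width.
    return [[2.0 * w, 0.01 * w] for w in widths]

def proxy_accuracy(widths):
    # Stand-in score: wider layers help, with diminishing returns.
    return 1.0 - float(np.exp(-sum(widths) / 300.0))

best = None
for _ in range(200):
    widths = rng.integers(16, 129, size=3)              # sample a 3-layer config
    latency = predict_network_latency(candidate_layers(widths))
    if latency > LATENCY_BUDGET_MS:
        continue                                        # infeasible on target hardware
    score = proxy_accuracy(widths)
    if best is None or score > best[0]:
        best = (score, widths.tolist(), latency)

print("best feasible config (score, widths, predicted ms):", best)

In practice, the surveyed methods replace the stand-in accuracy score with measured validation accuracy and the random search with more sample-efficient strategies such as constrained Bayesian optimization, but the overall structure is the same: predict the hardware cost first, then optimize the DNN design under that prediction.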