Fast and Energy-Driven Design Space Exploration for Heterogeneous Architectures
Baptiste Roux, M. Gautier, O. Sentieys, J. Delahaye
2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
Published: 2017-04-01
DOI: 10.1109/FCCM.2017.31 (https://doi.org/10.1109/FCCM.2017.31)
Abstract
In recent years, the integration of specialized hardware accelerators into Multiprocessor Systems-on-Chip (MpSoC) has led to a new kind of architecture that combines software (SW) and hardware (HW) computational resources. For these Heterogeneous MpSoC (HMpSoC) architectures, performance and energy consumption depend on a large set of parameters, such as the HW/SW partitioning, the type of HW implementation, and the communication cost. Design Space Exploration (DSE) consists of adjusting these parameters while monitoring a set of metrics (execution time, power, energy efficiency) to find the best mapping of the application onto the targeted architecture. With the shift from performance-aware to energy-aware design, computer-aided design and development tools try to reduce the large design space by simplifying HW/SW mapping mechanisms. However, energy consumption is poorly supported in most DSE tools because it is difficult to estimate quickly and accurately. To address this, this work introduces a DSE method based on an analytical power model that circumvents the computation-time bottleneck of state-of-the-art DSE methods. The method optimizes the HW/SW partitioning and mapping under user-defined objectives, in particular an energy constraint. It targets tiling-based parallel applications and relies on an analytical power model that provides the DSE framework with the execution time and energy of a HW/SW configuration. The power model parameters are obtained from measurements of a tiny subset of the design space; these measurements are then fed into two extraction functions to obtain analytical formulations of the execution time and energy consumption of the computation kernel. The constraints of the partitioning problem are defined as a set of inequalities over Boolean, integer (discrete), and non-integer (continuous) variables within a Mixed Integer Linear Programming (MILP) framework. The configuration that minimizes the user objective (e.g., execution time or total energy consumption) can then be determined within a second using commercial or open-source solvers. The methodology was tested on a Zynq-based heterogeneous architecture with two application kernels: a matrix multiplication and a stencil computation. The results show at least a 12% speed-up and energy saving compared to standard approaches. They also show that the most energy-efficient solution is application- and platform-dependent and, moreover, hard to predict. Such a method could be included in a complete framework with multi-step exploration to obtain an energy-efficient mapping of a full application onto an HMpSoC, opening new opportunities for future computer-aided design tools.
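The flow described in the abstract (measure a few configurations, extract analytical time and energy expressions, then search for the best HW/SW split) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the linear per-tile time model, the power figures, the sample measurements, and the names `fit_line`, `Resource`, `evaluate`, and `explore` are all hypothetical. An exhaustive search over the number of tiles sent to hardware stands in for the MILP solver, which is only practical here because the toy design space has a single integer variable.

```python
from dataclasses import dataclass


def fit_line(samples):
    """Least-squares fit of y = a + b*x from a list of (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


@dataclass
class Resource:
    t0: float        # fixed time overhead in seconds (assumed model term)
    t_tile: float    # time per tile in seconds (assumed model term)
    p_active: float  # active power in watts (hypothetical value)


def evaluate(n_hw, n_tiles, sw, hw, p_static):
    """Return (execution time, energy) when n_hw tiles run on hardware."""
    n_sw = n_tiles - n_hw
    t_sw = sw.t0 + sw.t_tile * n_sw if n_sw else 0.0
    t_hw = hw.t0 + hw.t_tile * n_hw if n_hw else 0.0
    total_time = max(t_sw, t_hw)  # assumes SW and HW run concurrently
    energy = p_static * total_time + sw.p_active * t_sw + hw.p_active * t_hw
    return total_time, energy


def explore(n_tiles, sw, hw, p_static, objective="energy", max_time=None):
    """Exhaustive search over the split; a stand-in for the MILP solver."""
    best = None
    for n_hw in range(n_tiles + 1):
        t, e = evaluate(n_hw, n_tiles, sw, hw, p_static)
        if max_time is not None and t > max_time:
            continue  # violates the user-defined time budget
        cost = e if objective == "energy" else t
        if best is None or cost < best[0]:
            best = (cost, n_hw, t, e)
    return best  # None if no split satisfies the budget


# Made-up measurements: (number of tiles, measured time in seconds).
sw_t0, sw_slope = fit_line([(1, 0.010), (2, 0.020), (4, 0.040)])
hw_t0, hw_slope = fit_line([(1, 0.006), (2, 0.008), (4, 0.012)])
sw = Resource(sw_t0, sw_slope, p_active=0.5)
hw = Resource(hw_t0, hw_slope, p_active=0.2)

fastest = explore(16, sw, hw, p_static=0.3, objective="time")
greenest = explore(16, sw, hw, p_static=0.3, objective="energy")
budgeted = explore(16, sw, hw, p_static=0.3, objective="energy", max_time=0.033)
print("fastest split (tiles on HW):", fastest[1])
print("lowest-energy split (tiles on HW):", greenest[1])
print("lowest-energy split within 33 ms:", budgeted[1])
```

With these made-up numbers the minimum-time split and the minimum-energy split differ, which is a toy version of the abstract's point that the most energy-efficient solution is hard to predict. A real formulation would replace the loop with a MILP model over Boolean, integer, and continuous variables, as the abstract describes.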