A formal approach to the mapping of tasks on an heterogenous multicore, energy-aware architecture

2016 ACM/IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE). Pub Date: 2016-11-18. DOI: 10.1109/MEMCOD.2016.7797760

Emilien Kofman, R. Simone


Citations: 3

Abstract


Finding an optimal mapping of an application (tasks) onto a processor architecture (resources) remains an acute problem, as new types of heterogeneous multicore architectures are constantly being proposed. Physical allocation and temporal scheduling can be tackled at several levels, from abstract mathematical models and operations research solvers to practical simulation and run-time emulation. This work belongs to the first category. As is common in the embedded domain, our optimality metric combines power consumption (to be minimized) and performance (to be maintained). One specific feature is that we target a dedicated architecture, namely the ARM-based big.LITTLE platform style found in recent Android smartphones. Tasks can therefore execute either on fast, energy-hungry cores or on slower, energy-efficient ones. The problem is made more complex because each processor may switch its running frequency, which is a natural trade-off between performance and power consumption. We also account for the energy saved when a full block (big or LITTLE) can be powered down. Together these yield a specific set of requirements and constraints, expressed as equations and inequalities of a certain size, which must be fed to an appropriate solver (an SMT solver in our case). Our original aim was, and still is, to determine whether these techniques scale in this setting. We conducted experiments on several examples, and we describe in more detail a task-graph application based on the tiled Cholesky decomposition algorithm, chosen for its relevant size and complexity. We comment on our findings and on the modeling issues involved.
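The kind of problem the abstract describes can be illustrated on a toy instance. The sketch below is not the authors' model: it replaces the SMT solver with plain exhaustive enumeration, and every task name, work amount, operating point, and static power figure is a made-up assumption. It also simplifies by letting each task pick its own operating point and by scheduling tasks in one fixed topological order. It minimizes total energy under a deadline, and a cluster that runs no task is treated as powered down.

```python
from itertools import product

# Hypothetical task graph, listed in topological order: name -> (work, predecessors).
TASKS = {
    "A": (4, []),
    "B": (8, ["A"]),
    "C": (8, ["A"]),
    "D": (4, ["B", "C"]),
}

# Hypothetical platform: two cores per cluster.
CORES = {"big0": "big", "big1": "big", "little0": "LITTLE", "little1": "LITTLE"}

# Hypothetical operating points per cluster: (speed, dynamic power while running).
OPERATING_POINTS = {"big": [(2, 6), (4, 16)], "LITTLE": [(1, 1), (2, 3)]}

# Hypothetical static power, paid over the whole makespan by every cluster that
# runs at least one task; an unused cluster is powered down and costs nothing.
STATIC_POWER = {"big": 2, "LITTLE": 1}


def evaluate(assignment):
    """Return (energy, makespan) of one mapping: task -> (core, (speed, power))."""
    finish = {}
    core_free = {core: 0 for core in CORES}
    dynamic_energy = 0
    for task, (work, preds) in TASKS.items():
        core, (speed, power) = assignment[task]
        duration = work // speed  # exact for the values chosen above
        # A task starts once its core is free and all predecessors are done.
        start = max([core_free[core]] + [finish[p] for p in preds])
        finish[task] = start + duration
        core_free[core] = finish[task]
        dynamic_energy += duration * power
    makespan = max(finish.values())
    used_clusters = {CORES[core] for core, _ in assignment.values()}
    static_energy = sum(STATIC_POWER[c] * makespan for c in used_clusters)
    return dynamic_energy + static_energy, makespan


def best_mapping(deadline):
    """Enumerate all mappings; return (energy, makespan, assignment) or None."""
    choices = [
        (core, point)
        for core, cluster in CORES.items()
        for point in OPERATING_POINTS[cluster]
    ]
    best = None
    for combo in product(choices, repeat=len(TASKS)):
        assignment = dict(zip(TASKS, combo))
        energy, makespan = evaluate(assignment)
        if makespan <= deadline and (best is None or energy < best[0]):
            best = (energy, makespan, assignment)
    return best


if __name__ == "__main__":
    for deadline in (4, 8, 100):
        print(deadline, best_mapping(deadline))
```

With these invented numbers, a deadline of 4 forces every task onto the big cluster at its highest speed (energy 104), while a loose deadline lets the search keep the big cluster powered down and run everything on LITTLE cores (energy 40). Enumeration grows exponentially with the number of tasks, which is precisely the scaling question that motivates handing such constraints to a solver.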
