Flexible resource allocation and management for application graphs on ReNÉ MPSoC

PARMA-DITAM '16 Pub Date : 2016-01-18 DOI:10.1145/2872421.2872426

K. Madhu, Anuj Rao, Saptarsi Das, Krishna C. Madhava, S. Nandy, R. Narayan


Citations: 1

Abstract

Performance of an application on a many-core machine hinges primarily on the ability of the architecture to exploit parallelism and to provide fast memory accesses. Exploiting parallelism in static application graphs on a multicore target is relatively easy because compilers can map them onto an optimal set of processing elements and memory modules. Dynamic application graphs have computations and data dependencies that manifest at runtime and hence may not be schedulable statically. Load balancing of such graphs requires runtime support (such as support for work-stealing) but incurs overheads due to data and code movement. In this work, we use ReNÉ MPSoC as an alternative to traditional many-core processing platforms to target application kernel graphs. ReNÉ is designed to be used as an accelerator to a host; it can exploit massive parallelism at multiple granularities and supports work-stealing for dynamic load balancing. Further, it offers handles to enable and disable work-stealing selectively. ReNÉ employs an explicitly managed global memory with minimal hardware support for the address translation required for relocating application kernels. We present a resource management methodology on ReNÉ MPSoC that encompasses a lightweight resource management hardware module and a compilation flow. Our methodology aims to identify resource requirements at compile time and to create resource boundaries (per application kernel) that guarantee performance and maximize resource utilization. The approach offers flexibility in resource allocation similar to that of a dynamic scheduling runtime, but guarantees performance because locality of reference of data and code can be ensured.
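The idea of per-kernel resource boundaries with selectively enabled work-stealing can be illustrated with a small sketch. It is not the ReNÉ hardware module or compilation flow, whose details the abstract does not give. It assumes that each kernel's requirement is a simple count of processing elements (PEs), that a boundary is a contiguous PE range, and that stealing is permitted only between PEs inside one boundary. All names (`KernelRequest`, `Boundary`, `allocate`, `may_steal`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class KernelRequest:
    """Compile-time estimate for one application kernel (hypothetical fields)."""
    name: str
    num_pes: int          # processing elements the kernel is assumed to need
    allow_stealing: bool  # whether work-stealing is enabled for this kernel


@dataclass(frozen=True)
class Boundary:
    """Contiguous half-open PE range [first_pe, end_pe) reserved for one kernel."""
    kernel: str
    first_pe: int
    end_pe: int
    allow_stealing: bool

    def contains(self, pe: int) -> bool:
        return self.first_pe <= pe < self.end_pe


def allocate(requests: List[KernelRequest], total_pes: int) -> Dict[str, Boundary]:
    """Assign disjoint PE ranges in request order; fail if the fabric is too small."""
    boundaries: Dict[str, Boundary] = {}
    next_free = 0
    for req in requests:
        if req.num_pes <= 0:
            raise ValueError(f"kernel {req.name!r} must request at least one PE")
        if next_free + req.num_pes > total_pes:
            raise ValueError(f"not enough PEs for kernel {req.name!r}")
        boundaries[req.name] = Boundary(
            req.name, next_free, next_free + req.num_pes, req.allow_stealing
        )
        next_free += req.num_pes
    return boundaries


def may_steal(boundaries: Dict[str, Boundary], thief_pe: int, victim_pe: int) -> bool:
    """Allow a steal only inside one boundary whose kernel has stealing enabled."""
    if thief_pe == victim_pe:
        return False
    for b in boundaries.values():
        if b.contains(thief_pe):
            return b.allow_stealing and b.contains(victim_pe)
    return False  # the thief PE belongs to no kernel


if __name__ == "__main__":
    plan = allocate(
        [KernelRequest("fft", 4, True), KernelRequest("fir", 2, False)],
        total_pes=8,
    )
    for b in plan.values():
        print(b)
```

Confining steals to a boundary is one way to keep a kernel's data and code local while still balancing load within that kernel, which is the trade-off the abstract describes. A real allocator would also need to account for memory modules and address relocation.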
