K. Madhu, Anuj Rao, Saptarsi Das, Krishna C. Madhava, S. Nandy, R. Narayan
PARMA-DITAM '16, published 2016-01-18. DOI: 10.1145/2872421.2872426 (https://doi.org/10.1145/2872421.2872426)
Flexible resource allocation and management for application graphs on ReNÉ MPSoC
Performance of an application on a many-core machine hinges primarily on the ability of the architecture to exploit parallelism and to provide fast memory accesses. Exploiting parallelism in static application graphs on a multicore target is relatively easy because compilers can map them onto an optimal set of processing elements and memory modules. Dynamic application graphs have computations and data dependencies that manifest at runtime and hence may not be schedulable statically. Load balancing of such graphs requires runtime support (such as support for work-stealing), which incurs overheads due to data and code movement. In this work, we use ReNÉ MPSoC as an alternative to traditional many-core processing platforms to target application kernel graphs. ReNÉ is designed to be used as an accelerator to a host; it can exploit massive parallelism at multiple granularities and supports work-stealing for dynamic load balancing. Further, it offers handles to enable and disable work-stealing selectively. ReNÉ employs an explicitly managed global memory with minimal hardware support for the address translation required to relocate application kernels. We present a resource management methodology for ReNÉ MPSoC that encompasses a lightweight resource management hardware module and a compilation flow. Our methodology aims to identify resource requirements at compile time and to create resource boundaries (per application kernel) that guarantee performance and maximize resource utilization. The approach offers flexibility in resource allocation similar to that of a dynamic scheduling runtime, but guarantees performance because locality of reference of data and code can be ensured.
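The abstract does not detail the hardware module or the compilation flow, so the following is only a minimal sketch of the general idea of per-kernel resource boundaries, not the ReNÉ implementation. It assumes a flat array of processing elements, a linear global memory, base-plus-offset address translation, and a per-kernel work-stealing flag; the names `KernelRequest`, `Boundary`, `allocate`, and `may_steal` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class KernelRequest:
    """Compile-time resource requirement of one kernel (hypothetical format)."""
    name: str
    num_pes: int          # processing elements requested
    mem_words: int        # global-memory words requested
    allow_stealing: bool  # whether work-stealing is enabled for this kernel


@dataclass(frozen=True)
class Boundary:
    """Resource boundary assigned to one kernel."""
    pe_start: int         # first PE index (inclusive)
    pe_end: int           # last PE index (exclusive)
    mem_base: int         # base address of the kernel's memory region
    mem_words: int        # size of the region in words
    allow_stealing: bool

    def translate(self, offset: int) -> int:
        """Assumed base-plus-offset translation for a relocated kernel."""
        if not 0 <= offset < self.mem_words:
            raise ValueError("offset outside the kernel's memory region")
        return self.mem_base + offset


def allocate(requests: List[KernelRequest], total_pes: int,
             total_mem: int) -> Dict[str, Boundary]:
    """Assign disjoint, contiguous PE and memory ranges in request order."""
    boundaries: Dict[str, Boundary] = {}
    next_pe, next_mem = 0, 0
    for req in requests:
        if (next_pe + req.num_pes > total_pes
                or next_mem + req.mem_words > total_mem):
            raise RuntimeError(f"not enough resources for kernel {req.name!r}")
        boundaries[req.name] = Boundary(next_pe, next_pe + req.num_pes,
                                        next_mem, req.mem_words,
                                        req.allow_stealing)
        next_pe += req.num_pes
        next_mem += req.mem_words
    return boundaries


def owner(boundaries: Dict[str, Boundary], pe: int) -> Optional[str]:
    """Return the kernel whose boundary contains the given PE, if any."""
    for name, b in boundaries.items():
        if b.pe_start <= pe < b.pe_end:
            return name
    return None


def may_steal(boundaries: Dict[str, Boundary], thief: int, victim: int) -> bool:
    """Permit stealing only between distinct PEs of the same kernel,
    and only when that kernel has work-stealing enabled."""
    t, v = owner(boundaries, thief), owner(boundaries, victim)
    return (t is not None and t == v and thief != victim
            and boundaries[t].allow_stealing)
```

In this sketch, confining stealing to a kernel's own boundary is what keeps code and data local to that kernel's resources, while the per-kernel flag stands in for the selective enable and disable handles mentioned above.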