A bounded memory allocator for software-defined global address spaces

François Gindraud, F. Rastello, Albert Cohen, François Broquedis
Published in: Proceedings of the 2016 ACM SIGPLAN International Symposium on Memory Management, 2016-06-14
DOI: https://doi.org/10.1145/2926697.2926709
Citations: 2

Abstract

This paper presents a memory allocator targeting manycore architectures with distributed memory. Part of the Multi-Processor System-on-Chip (MPSoC) family, these devices consist of multiple nodes linked by an on-chip network; most nodes have multiple processors sharing a small local memory. While MPSoCs typically excel in performance per watt, they remain hard to program because of multilevel parallelism, explicit resource and memory management, and hardware constraints (limited memory, network topology). Typical programming frameworks for MPSoCs leave much target-specific work to the programmer: combining threads or node-local OpenMP, software caching, and explicit message passing (and sometimes routing), all through non-standard interfaces. More abstract, automatic frameworks exist, but they target large-scale clusters and do not model the hardware constraints of MPSoCs. The memory allocator described in this paper is one component of a larger runtime system, called Givy, which supports dynamic task graphs with automatic software caching and data-driven execution on MPSoCs. To simplify the programmer’s view of memory, both runtime and program data objects live in a Global Address Space (GAS). A GAS-aware memory allocator is required to avoid address collisions when objects are dynamically allocated and to manage virtual memory mappings across nodes. This paper proposes such an allocator with the following properties: (1) it is free of inter-node synchronization; (2) its node-local performance matches that of state-of-the-art shared-memory allocators; (3) it provides node-local mechanisms to implement inter-node software caching within a GAS; (4) it is well suited to small-memory systems (a few MB per node).
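The abstract does not say how the allocator avoids address collisions without inter-node synchronization. One common way to obtain that property, shown here purely as an illustrative sketch and not as the design used in Givy, is to give each node a disjoint slice of the global address range and let it allocate only within its own slice. All names and constants below (`GAS_BASE`, `NODE_REGION_SIZE`, `NodeAllocator`, `owner_of`) are hypothetical, and the bump-pointer policy is a simplification that never frees memory.

```python
from dataclasses import dataclass

# Hypothetical layout constants; the paper's actual values are not given in the abstract.
GAS_BASE = 0x4000_0000_0000      # assumed start of the global address space
NODE_REGION_SIZE = 1 << 32       # assumed size of each node's private slice
ALIGNMENT = 16                   # assumed allocation alignment in bytes


@dataclass
class NodeAllocator:
    """Bump allocator confined to one node's slice of the address space."""
    node_id: int
    offset: int = 0  # bytes already handed out within this node's slice

    @property
    def base(self) -> int:
        # Slices are disjoint by construction, so nodes never need to coordinate.
        return GAS_BASE + self.node_id * NODE_REGION_SIZE

    def allocate(self, size: int) -> int:
        """Return a global address for `size` bytes, using node-local state only."""
        if size <= 0:
            raise ValueError("size must be positive")
        # Round the request up to the next multiple of ALIGNMENT.
        aligned = (size + ALIGNMENT - 1) & ~(ALIGNMENT - 1)
        if self.offset + aligned > NODE_REGION_SIZE:
            raise MemoryError("node slice exhausted")
        addr = self.base + self.offset
        self.offset += aligned
        return addr


def owner_of(addr: int) -> int:
    """Recover the allocating node from a global address, without communication."""
    return (addr - GAS_BASE) // NODE_REGION_SIZE
```

Because every address encodes its owning node, any node can tell where an object was allocated by arithmetic alone, which is the kind of node-local information a software cache would need when deciding whether data is local or remote.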