Using memory mapping to support cactus stacks in work-stealing runtime systems

I. Lee, Silas Boyd-Wickizer, Zhiyi Huang, C. Leiserson
Published in: 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010-09-11
DOI: 10.1145/1854273.1854324 (https://doi.org/10.1145/1854273.1854324)
Citations: 48

Abstract

Many multithreaded concurrency platforms that use a work-stealing runtime system incorporate a “cactus stack,” wherein a function's accesses to stack variables properly respect the function's calling ancestry, even when many of the functions operate in parallel. Unfortunately, such existing concurrency platforms fail to satisfy at least one of the following three desirable criteria: (1) full interoperability with legacy or third-party serial binaries that have been compiled to use an ordinary linear stack, (2) a scheduler that provides near-perfect linear speedup on applications with sufficient parallelism, and (3) bounded and efficient use of memory for the cactus stack. We have addressed this cactus-stack problem by modifying the Linux operating system kernel to provide support for thread-local memory mapping (TLMM). We have used TLMM to reimplement the cactus stack in the open-source Cilk-5 runtime system. The resulting Cilk-M runtime system removes the linguistic distinction imposed by Cilk-5 between serial code and parallel code, erases Cilk-5's limitation that serial code cannot call parallel code, and provides full compatibility with existing serial calling conventions. The Cilk-M runtime system provides strong guarantees on scheduler performance and stack space. Benchmark results indicate that the performance of the prototype Cilk-M 1.0 is comparable to the Cilk 5.4.6 system, and the consumption of stack space is modest.
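To make the cactus-stack idea concrete, the following is a minimal illustrative sketch, not the Cilk-M implementation: it models the stack as a tree of frames with parent links, so that functions running in parallel each see their own frame plus the shared frames of their calling ancestors. The names `Frame`, `spawn`, `view`, and `lookup` are hypothetical, and looking up variables by name is a simplification of real stack accesses, which go through pointers. The sketch says nothing about TLMM, which the abstract describes as a Linux kernel modification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Frame:
    """One activation record; `parent` links to the caller's frame (hypothetical model)."""
    name: str
    parent: Optional["Frame"] = None
    local_vars: Dict[str, int] = field(default_factory=dict)

    def spawn(self, name: str, **local_vars: int) -> "Frame":
        """Create a child frame whose ancestry includes this frame."""
        return Frame(name, parent=self, local_vars=dict(local_vars))

    def view(self) -> List[str]:
        """Return the linear stack this frame sees, from the root down to itself."""
        path = []
        frame: Optional[Frame] = self
        while frame is not None:
            path.append(frame.name)
            frame = frame.parent
        return path[::-1]

    def lookup(self, var: str) -> int:
        """Find a variable in this frame or in the nearest ancestor that defines it."""
        frame: Optional[Frame] = self
        while frame is not None:
            if var in frame.local_vars:
                return frame.local_vars[var]
            frame = frame.parent
        raise KeyError(var)


# A calls B and C in parallel: both children share A's frame,
# but neither can see the other's frame.
root = Frame("A", local_vars={"x": 1})
b = root.spawn("B", y=2)
c = root.spawn("C", y=3)
```

In this model `b.view()` and `c.view()` each look like an ordinary linear stack rooted at `A`, which is the property the abstract refers to as respecting the calling ancestry; the siblings `B` and `C` hold independent values for `y` while sharing `x` from `A`.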