Hosting an object heap on manycore hardware: an exploration

D. Ungar, Sam S. Adams
Dynamic Languages Symposium, published 2009-10-26. DOI: 10.1145/1640134.1640149 (https://doi.org/10.1145/1640134.1640149)
Citations: 16

Abstract

To construct a test-bed for investigating new programming paradigms for future "manycore" systems (i.e., those with at least a thousand cores), we are building a Smalltalk virtual machine that attempts to efficiently use a collection of 56 on-chip caches of 64KB each to host a multi-megabyte object heap. In addition to the cost of inter-core communication, two hardware characteristics influenced our design: the absence of hardware-provided cache coherence, and the inability to move a single object from one core's cache to another's without changing its address. Our design relies on an object table and on a user-managed caching regime for read-mostly objects. At almost every stage of our process, we obtained measurements to guide the evolution of our system. The architecture and performance characteristics of a manycore platform confound old intuitions by deviating from both traditional multicore systems and distributed systems. The implementor confronts a wide variety of design choices, such as when to share address space, when to share memory as opposed to sending a message, and how to eke out the most performance from a memory system that is far more tightly integrated than a distributed system yet far less centralized than a several-core system. Our system is far from complete, let alone optimal, but our experiences have helped us develop the new intuitions needed to rise to the manycore software challenge.
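
The abstract names an object table as the answer to one hardware constraint: an object cannot move between cores' caches without changing its address. The abstract does not describe the table's layout, so the following is only a minimal sketch of the general indirection idea, not the authors' implementation. The names `Core`, `ObjectTable`, `new_object`, `read`, and `move` are hypothetical, each core's cache is modeled as a plain dictionary, and the user-managed caching of read-mostly objects is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical stand-in for one core and its private on-chip cache."""
    core_id: int
    memory: dict = field(default_factory=dict)  # local address -> payload
    next_free: int = 0

    def allocate(self, payload):
        # Store the payload at the next free local address and return it.
        addr = self.next_free
        self.memory[addr] = payload
        self.next_free += 1
        return addr


class ObjectTable:
    """Maps a stable object id to the (core, address) that holds the object."""

    def __init__(self, cores):
        self.cores = cores
        self.entries = []  # index = object id, value = (core_id, address)

    def new_object(self, core_id, payload):
        addr = self.cores[core_id].allocate(payload)
        self.entries.append((core_id, addr))
        return len(self.entries) - 1

    def read(self, oid):
        core_id, addr = self.entries[oid]
        return self.cores[core_id].memory[addr]

    def move(self, oid, dest_core_id):
        # The object receives a new address on the destination core.
        # Only the table entry changes, so holders of `oid` stay valid.
        core_id, addr = self.entries[oid]
        payload = self.cores[core_id].memory.pop(addr)
        new_addr = self.cores[dest_core_id].allocate(payload)
        self.entries[oid] = (dest_core_id, new_addr)


cores = [Core(i) for i in range(4)]
table = ObjectTable(cores)
point = table.new_object(0, {"x": 3, "y": 4})
holder = table.new_object(1, {"ref": point})  # refers to point by id

table.move(point, 2)  # point now lives at a different address on core 2
moved = table.read(table.read(holder)["ref"])
```

In this sketch `holder` stores the object id rather than a raw address, so moving `point` to another core requires updating one table entry instead of every reference to it. The price is an extra lookup on each access.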