{"title":"A low overhead method for recovering unused memory inside regions","authors":"Matthew Davis, P. Schachte, Z. Somogyi, H. Søndergaard","doi":"10.1145/2492408.2492415","DOIUrl":"https://doi.org/10.1145/2492408.2492415","url":null,"abstract":"Automating memory management improves both resource safety and programmer productivity. One approach, region-based memory management [9] (RBMM), applies compile-time reasoning to identify points in a program at which memory can be safely reclaimed. The main advantage of RBMM over traditional garbage collection (GC) is the avoidance of expensive runtime analysis, which makes reclaiming memory much faster. On the other hand, GC requires no static analysis, and, operating at runtime, can have significantly more accurate information about object lifetimes. In this paper we propose a hybrid system that seeks to combine the advantages of both methods while avoiding the overheads that previous hybrid systems incurred. Our system can also reclaim array segments whose elements are no longer reachable.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133399906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Program-centric cost models for locality","authors":"G. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, H. Simhadri","doi":"10.1145/2492408.2492417","DOIUrl":"https://doi.org/10.1145/2492408.2492417","url":null,"abstract":"In this position paper, we argue that cost models for locality in parallel machines should be program-centric, not machine-centric.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124291155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can seqlocks get along with programming language memory models?","authors":"H. Boehm","doi":"10.1145/2247684.2247688","DOIUrl":"https://doi.org/10.1145/2247684.2247688","url":null,"abstract":"Seqlocks are an important synchronization mechanism and represent a significant improvement over conventional reader-writer locks in some contexts. They avoid the need to update a synchronization variable during a reader critical section, and hence improve performance by avoiding cache coherence misses on the lock object itself. Unfortunately, they rely on speculative racing loads inside the critical section. This makes them an interesting problem case for programming-language-level memory models that emphasize data-race-free programming. We analyze a variety of implementation alternatives within the C++11 memory model, and briefly address the corresponding issue in Java. In the process, we observe that there may be a use for \"read-dont-modify-write\" operations, i.e., read-modify-write operations that atomically write back the original value, without modifying it, solely for the memory model consequences, and that it may be useful for compilers to optimize such operations.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"603 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131966555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
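The mechanism Boehm analyzes can be sketched with C++11 atomics. This is an illustrative single-writer variant using the fence-based formulation (the names and the particular choice of memory orders are ours, not one of the paper's specific implementation alternatives): the writer makes the sequence counter odd while updating, and readers retry if the counter was odd or changed across their critical section.

```cpp
#include <atomic>
#include <cstdint>

// Minimal single-writer seqlock sketch (illustrative only).
struct SeqLock {
    std::atomic<uint32_t> seq{0};
    std::atomic<int> data1{0};   // protected data; atomic loads avoid
    std::atomic<int> data2{0};   // undefined behavior on the racing reads

    void write(int a, int b) {
        uint32_t s = seq.load(std::memory_order_relaxed);
        seq.store(s + 1, std::memory_order_relaxed);       // odd: write in progress
        std::atomic_thread_fence(std::memory_order_release);
        data1.store(a, std::memory_order_relaxed);
        data2.store(b, std::memory_order_relaxed);
        seq.store(s + 2, std::memory_order_release);       // even: write complete
    }

    bool try_read(int& a, int& b) {
        uint32_t s1 = seq.load(std::memory_order_acquire);
        if (s1 & 1) return false;                          // writer active
        a = data1.load(std::memory_order_relaxed);         // speculative racing loads
        b = data2.load(std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_acquire);
        uint32_t s2 = seq.load(std::memory_order_relaxed);
        return s1 == s2;                                   // retry if a write intervened
    }
};
```

Note how the reader's critical section writes no shared location at all, which is the cache-coherence advantage the abstract describes; the paper's "read-dont-modify-write" discussion concerns giving the final `seq` re-check stronger ordering than a plain load provides.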
{"title":"Rank idle time prediction driven last-level cache writeback","authors":"Zhe Wang, S. Khan, Daniel A. Jiménez","doi":"10.1145/2247684.2247690","DOIUrl":"https://doi.org/10.1145/2247684.2247690","url":null,"abstract":"In modern DDRx memory systems, memory write requests can cause significant performance loss by increasing the memory access latency for subsequent read requests targeting the same device. In this paper, we propose a rank idle time prediction driven last-level cache writeback technique. This technique uses a rank idle time predictor to predict long phases of idle rank cycles. The scheduled dirty cache blocks from the last-level cache are written back during the predicted long rank idle periods. This technique allows write requests to be serviced at the points that minimize the delay they cause to subsequent read requests, significantly reducing write-induced interference.\n We evaluate our technique using a cycle-accurate full-system simulator and the SPEC CPU2006 benchmarks. The results show that the technique improves performance in an eight-core system with memory-intensive workloads on average by 10.5% and 10.1% over conventional writeback using two-rank and four-rank DRAM configurations, respectively.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123734654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards region-based memory management for Go","authors":"Matthew Davis, P. Schachte, Z. Somogyi, H. Søndergaard","doi":"10.1145/2247684.2247695","DOIUrl":"https://doi.org/10.1145/2247684.2247695","url":null,"abstract":"Region-based memory management aims to lower the cost of deallocation through bulk processing: instead of recovering the memory of each object separately, it recovers the memory of a region containing many objects. It relies on static analysis to determine the set of memory regions needed by a program, the program points at which each region should be created and removed, and, for each memory allocation, the region that should supply the memory. The concurrent language Go has features that pose interesting challenges for this analysis. We present a novel design for region-based memory management for Go, combining static analysis, to guide region creation, and lightweight runtime bookkeeping, to help control reclamation. The main advantage of our approach is that it greatly limits the amount of re-work that must be done after each change to the program source code, making our approach more practical than existing RBMM systems. Our prototype implementation covers most of the sequential fragment of Go, and preliminary results are encouraging.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123899872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
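The bulk-deallocation idea behind RBMM can be illustrated with a bump-pointer region allocator. This is a hypothetical C++ sketch of the general technique, not the paper's Go implementation: allocation is a pointer increment into the current chunk, and the entire region is reclaimed at once when it is destroyed, rather than object by object.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Toy bump-pointer region (illustrative; does not handle allocations
// larger than one chunk).
class Region {
    static constexpr std::size_t kChunkSize = 64 * 1024;
    std::vector<char*> chunks_;
    std::size_t used_ = kChunkSize;   // forces a chunk allocation on first use
public:
    void* allocate(std::size_t n) {
        n = (n + 7) & ~std::size_t{7};            // keep 8-byte alignment
        if (used_ + n > kChunkSize) {             // current chunk exhausted
            chunks_.push_back(static_cast<char*>(std::malloc(kChunkSize)));
            used_ = 0;
        }
        void* p = chunks_.back() + used_;         // bump the pointer
        used_ += n;
        return p;
    }
    ~Region() {                                   // bulk reclamation: free
        for (char* c : chunks_) std::free(c);     // whole chunks, not objects
    }
};
```

The static analysis the abstract describes decides where such regions are created and destroyed and which allocation sites draw from which region; the runtime bookkeeping the authors add is what this sketch omits.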
{"title":"A higher order theory of locality","authors":"C. Ding, Xiaoya Xiang","doi":"10.1145/2247684.2247697","DOIUrl":"https://doi.org/10.1145/2247684.2247697","url":null,"abstract":"This short paper outlines a theory for deriving the traditional metrics of miss rate and reuse distance from a single measure called the footprint. It gives the correctness condition and discusses the uses of the new theory in on-line locality analysis and multicore cache management.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126690551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel memory defragmentation on a GPU","authors":"R. Veldema, M. Philippsen","doi":"10.1145/2247684.2247693","DOIUrl":"https://doi.org/10.1145/2247684.2247693","url":null,"abstract":"High-throughput memory management techniques such as malloc/free or mark-and-sweep collectors often exhibit memory fragmentation, leaving allocated objects interspersed with free memory holes. Memory defragmentation removes such holes by moving objects around in memory so that they become adjacent (compaction) and holes can be merged (coalesced) to form larger holes. However, known defragmentation techniques are slow. This paper presents a parallel solution for best-effort partial defragmentation that makes use of all available cores. The solution not only speeds up defragmentation significantly, but also scales to many simple cores. It can therefore even be implemented on a GPU.\n One problem with compaction is that it requires all references to moved objects to be retargeted to point to their new locations. This paper further improves on existing work by better identifying the parts of the heap that contain references to objects moved by the compactor, and only processes these parts to find the references, which are then retargeted in parallel.\n To demonstrate the performance of the new memory defragmentation algorithm on many-core processors, we show its performance on a modern GPU. Parallelization speeds up compaction 40 times and coalescing up to 32 times. After compaction, our algorithm only needs to process 2%--4% of the total heap to retarget references.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"447 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133281312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
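The two phases the abstract describes, sliding live objects together and then retargeting references, can be shown in a toy sequential C++ sketch (hypothetical and index-based; the paper's contribution is performing these phases in parallel on a GPU, which this does not attempt):

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Toy heap model: each object records its liveness and the indices of
// the objects it references.
struct Obj {
    std::size_t size;
    bool live;
    std::vector<std::size_t> refs;   // indices of referenced objects
};

// Sequential compaction sketch: slide live objects over the holes left
// by dead ones, then retarget every reference through a forwarding table.
void compact(std::vector<Obj>& heap) {
    std::unordered_map<std::size_t, std::size_t> forward;  // old -> new index
    std::size_t dst = 0;
    for (std::size_t i = 0; i < heap.size(); ++i)   // phase 1: compaction
        if (heap[i].live) {
            forward[i] = dst;
            heap[dst++] = heap[i];
        }
    heap.resize(dst);                               // holes are now coalesced
    for (Obj& o : heap)                             // phase 2: retargeting
        for (std::size_t& r : o.refs)
            r = forward[r];
}
```

The paper's 2%--4% figure corresponds to restricting phase 2 to only those heap parts known to contain references to moved objects, instead of scanning everything as this sketch does.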
{"title":"Analysis of pure methods using garbage collection","authors":"Erik Österlund, Welf Löwe","doi":"10.1145/2247684.2247694","DOIUrl":"https://doi.org/10.1145/2247684.2247694","url":null,"abstract":"Parallelization and other optimizations often depend on static dependence analysis. This approach requires methods to be independent regardless of the input data, which is not always the case.\n Our contribution is a dynamic analysis \"guessing\" whether methods are pure, i.e., whether they do not change state. The analysis piggybacks on a garbage collector, more specifically, a concurrent, replicating garbage collector. It guesses whether objects are immutable by looking at actual mutations observed by the garbage collector. The analysis comes essentially for free. In fact, our concurrent garbage collector including the analysis outperforms Boehm's stop-the-world collector (without any analysis), as we show in experiments. Moreover, false guesses can be rolled back efficiently.\n The results can be used for just-in-time parallelization, allowing automatic parallelization of methods that are pure over certain periods of time. Hence, compared to parallelization based on static dependence analysis, more programs potentially benefit from parallelization.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131011652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can parallel data structures rely on automatic memory managers?","authors":"E. Petrank","doi":"10.1145/2247684.2247685","DOIUrl":"https://doi.org/10.1145/2247684.2247685","url":null,"abstract":"The complexity of parallel data structures is often measured by two major factors: the throughput they provide and the progress they guarantee. Progress guarantees are particularly important for systems that require responsiveness such as real-time systems, operating systems, interactive systems, etc. Notions of progress guarantees such as lock-freedom, wait-freedom, and obstruction-freedom that provide different levels of guarantees have been proposed in the literature [4, 6]. Concurrent access (and furthermore, optimistic access) to shared objects makes the management of memory one of the more complex aspects of concurrent algorithms design. The use of automatic memory management greatly simplifies such algorithms [11, 3, 2, 9]. However, while the existence of lock-free garbage collection has been demonstrated [5], the existence of a practical automatic memory manager that supports lock-free or wait-free algorithms is still open. Furthermore, known schemes for manual reclamation of unused objects are difficult to use and impose a significant overhead on the execution [10].\n It turns out that the memory management community is not fully aware of how dire the need is for memory managers that support progress guarantees in the design of concurrent data structures. Likewise, designers of concurrent data structures are not always aware that memory management with support for progress guarantees is not available. Closing this gap is a major open problem for both communities.\n In this talk we will examine the memory management needs of concurrent algorithms. Next, we will discuss how state-of-the-art research and practice deal with the fact that an important piece of technology is missing (e.g., [7, 1]). Finally, we will survey the currently available pieces of this puzzle (e.g., [13, 12, 8]) and specify which pieces are missing. This open problem is arguably the greatest challenge facing the memory management community today.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117218114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trace-driven simulation of memory system scheduling in multithread application","authors":"Peng Fei Zhu, Mingyu Chen, Yungang Bao, Licheng Chen, Yongbing Huang","doi":"10.1145/2247684.2247691","DOIUrl":"https://doi.org/10.1145/2247684.2247691","url":null,"abstract":"As commercial chip-multiprocessors (CMPs) integrate more and more cores, memory systems are playing an increasingly important role in multithread applications. Currently, trace-driven simulation is widely adopted in memory system scheduling research, since it is faster than execution-driven simulation and does not require data computation. For the same reason, however, its trace replay of concurrent thread execution lacks data information and contains only addresses, so misplacement occurs in simulations when the trace of one thread runs ahead of or behind the others. This kind of distortion can cause significant errors: as shown in our experiments, trace misplacement causes an error rate of up to 10.22% in metrics including weighted IPC speedup, harmonic mean of IPC, and CPI throughput. This paper presents a methodology to avoid trace misplacement in trace-driven simulation and to ensure the accuracy of memory scheduling simulation in multithread applications, thus providing a reliable means to study inter-thread interactions in memory systems.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"515 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116212375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}