Workshop on Memory System Performance and Correctness: Latest Papers, Page 3

Supporting virtual memory in GPGPU without supporting precise exceptions

Workshop on Memory System Performance and Correctness, Pub Date: 2012-06-16, DOI: 10.1145/2247684.2247698

Hyesoon Kim

Citations: 11

Identifying optimal multicore cache hierarchies for loop-based parallel programs via reuse distance analysis

Workshop on Memory System Performance and Correctness, Pub Date: 2012-06-16, DOI: 10.1145/2247684.2247687

Meng-Ju Wu, D. Yeung

Citations: 18

Defensive loop tiling for multi-core processor

Workshop on Memory System Performance and Correctness, Pub Date: 2012-06-16, DOI: 10.1145/2247684.2247701

Bin Bao, Xiaoya Xiang

Citations: 1

A study towards optimal data layout for GPU computing

Workshop on Memory System Performance and Correctness, Pub Date: 2012-06-16, DOI: 10.1145/2247684.2247699

E. Zhang, Han Li, Xipeng Shen

Citations: 7

Let there be light!: the future of memory systems is photonics and 3D stacking

Workshop on Memory System Performance and Correctness, Pub Date: 2011-06-05, DOI: 10.1145/1988915.1988926

K. Bergman, G. Hendry, Paul H. Hargrove, J. Shalf, B. Jacob, K. Hemmert, Arun Rodrigues, D. Resnick

Citations: 14

Deferred gratification: engineering for high performance garbage collection from the get go

Workshop on Memory System Performance and Correctness, Pub Date: 2011-06-05, DOI: 10.1145/1988915.1988930

Ivan Jibaja, S. Blackburn, M. Haghighat, K. McKinley

Abstract: Implementing a new programming language system is a daunting task. A common trap is to punt on the design and engineering of exact garbage collection and instead opt for reference counting or conservative garbage collection (GC). For example, AppleScript™, Perl, Python, and PHP implementers chose reference counting (RC) and Ruby chose conservative GC. Although easier to get working, reference counting has terrible performance and conservative GC is inflexible and performs poorly when allocation rates are high. However, high performance GC is central to performance for managed languages and only becoming more critical due to relatively lower memory bandwidth and higher memory latency of modern architectures. Unfortunately, retrofitting support for high performance collectors later is a formidable software engineering task due to their exact nature. Whether they realize it or not, implementers have three routes: (1) forge ahead with reference counting or conservative GC, and worry about the consequences later; (2) build the language on top of an existing managed runtime with exact GC, and tune the GC to scripting language workloads; or (3) engineer exact GC from the ground up and enjoy the correctness and performance benefits sooner rather than later.

We explore this conundrum using PHP, the most popular server-side scripting language. PHP implements reference counting, mirroring scripting languages before it. Because reference counting is incomplete, the implementers must (a) also implement tracing to detect cyclic garbage, or (b) prohibit cyclic data structures, or (c) never reclaim cyclic garbage. PHP chose (a), AppleScript chose (b), and Perl chose (c). We characterize the memory behavior of five typical PHP programs to determine whether their implementation choice was a good one in light of the growing demand for high performance PHP. The memory behavior of these PHP programs is similar to other managed languages, such as Java™: they allocate many short-lived objects, a large variety of object sizes, and the average allocated object size is small. These characteristics suggest copying generational GC will attain high performance.

Language implementers who are serious about correctness and performance need to understand deferred gratification: paying the software engineering cost of exact GC up front will deliver correctness and memory system performance later.

Citations: 11
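
The abstract's remark that reference counting is incomplete, so cyclic garbage needs a separate tracing pass, can be observed directly in CPython, which pairs reference counting with a tracing cycle detector (option (a) in the abstract). The following is a minimal sketch and is not taken from the paper; `Node`, `make_cycle`, and `demo` are hypothetical names, and the behavior shown assumes CPython's reference-counting implementation.

```python
import gc
import weakref


class Node:
    """A tiny object that can point at another Node (hypothetical helper)."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Build a two-node cycle and return only a weak reference to one node."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


def demo():
    """Return (alive after reference counting only, alive after tracing)."""
    gc.disable()                  # leave only reference counting active
    try:
        probe = make_cycle()      # the cycle is unreachable from here on
        alive_after_rc = probe() is not None
        gc.collect()              # run the tracing cycle detector explicitly
        alive_after_trace = probe() is not None
    finally:
        gc.enable()
    return alive_after_rc, alive_after_trace


if __name__ == "__main__":
    print(demo())
```

On CPython, `demo()` reports the node as still alive after reference counting alone and as reclaimed once the tracing pass has run, which is the gap that option (a) closes.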

How to fit program footprint curves

Workshop on Memory System Performance and Correctness, Pub Date: 2011-06-05, DOI: 10.1145/1988915.1988920

Xiaoya Xiang, Bin Bao

Citations: 0

A programming model for deterministic task parallelism

Workshop on Memory System Performance and Correctness, Pub Date: 2011-06-05, DOI: 10.1145/1988915.1988918

Polyvios Pratikakis, H. Vandierendonck, Spyros Lyberis, Dimitrios S. Nikolopoulos

Abstract: The currently dominant programming models to write software for multicore processors use threads that run over shared memory. However, as the core count increases, cache coherency protocols get very complex and ineffective, and maintaining a shared memory abstraction becomes expensive and impractical. Moreover, writing multithreaded programs is notoriously difficult, as the programmer needs to reason about all the possible thread interleavings and interactions, including the myriad of implicit, non-obvious, and often unpredictable thread interactions through shared memory. Overall, as processors get more cores and parallel software becomes mainstream, the shared memory model reaches its limits regarding ease of programming and efficiency.

This position paper presents two ideas aiming to solve the problem. First, we restrict the way the programmer expresses parallelism: The program is a collection of possibly recursive tasks, where each task is atomic and cannot communicate with any other task during its execution. Second, we relax the requirement for coherent shared memory: Each task defines its memory footprint, and is guaranteed to have exclusive access to that memory during its execution. Using this model, we can then define a runtime system that transparently performs the data transfers required among cores without cache coherency, and also produces a deterministic execution of the program, provably equivalent to its sequential elision.

Citations: 21
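
The idea in the abstract above, that tasks declare a memory footprint and the runtime uses it to run them in parallel while matching sequential order, can be illustrated with a toy scheduler. This is only a sketch under stated assumptions and not the runtime the paper proposes: footprints are modeled as sets of dictionary keys, every access is treated as exclusive, and `schedule`, `run`, `TASKS`, and the three task functions are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def schedule(tasks):
    """Group (footprint, fn) tasks into waves. Two tasks whose footprints
    overlap always land in different waves, in program order."""
    waves = []      # waves[k] holds the indices of tasks run in wave k
    wave_of = []    # wave chosen for each task seen so far
    for i, (footprint, _) in enumerate(tasks):
        level = 0
        for j in range(i):
            if footprint & tasks[j][0]:        # conflict with an earlier task
                level = max(level, wave_of[j] + 1)
        wave_of.append(level)
        if level == len(waves):
            waves.append([])
        waves[level].append(i)
    return waves


def run(tasks, memory, workers=4):
    """Run each wave in parallel, with a barrier between waves."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for wave in schedule(tasks):
            # Tasks within a wave have pairwise disjoint footprints.
            list(pool.map(lambda i: tasks[i][1](memory), wave))
    return memory


def bump_x(m):
    m["x"] = m["x"] + 1


def double_y(m):
    m["y"] = m["y"] * 2


def add_y_to_x(m):
    m["x"] = m["x"] + m["y"]


# Each task is (declared footprint, function that touches only that footprint).
TASKS = [({"x"}, bump_x), ({"y"}, double_y), ({"x", "y"}, add_y_to_x)]

if __name__ == "__main__":
    print(schedule(TASKS))
    print(run(TASKS, {"x": 1, "y": 3}))
```

In this example the first two tasks touch disjoint keys and share a wave, while the third overlaps both and waits for them, so the final memory matches what running the three tasks one after another would produce. The guarantee holds only if each task really stays within its declared footprint, which this sketch does not enforce.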

The impact of diverse memory architectures on multicore consumer software: an industrial perspective from the video games domain

Workshop on Memory System Performance and Correctness, Pub Date: 2011-06-05, DOI: 10.1145/1988915.1988925

G. Russell, C. Riley, Neil Henning, Uwe Dolinsky, A. Richards, A. Donaldson, A. V. Amesfoort

Citations: 0

Approximating inclusion-based points-to analysis

Workshop on Memory System Performance and Correctness, Pub Date: 2011-06-05, DOI: 10.1145/1988915.1988931

R. Nasre

Citations: 5