Workshop on Memory System Performance and Correctness: Latest Papers

There is nothing wrong with out-of-thin-air: compiler optimization and memory models
Workshop on Memory System Performance and Correctness Pub Date : 2011-06-05 DOI: 10.1145/1988915.1988917
Clark Verbrugge, Allan Kielstra, Yi Zhang
Abstract: Memory models are used in concurrent systems to specify visibility properties of shared data. A practical memory model, however, must permit code optimization as well as provide a useful semantics for programmers. Here we extend recent observations that the current Java memory model imposes significant restrictions on the ability to optimize code. Beyond the known and potentially correctable proof concerns illustrated by others we show that major constraints on code generation and optimization can in fact be derived from fundamental properties and guarantees provided by the memory model. To address this and accommodate a better balance between programmability and optimization we present ideas for a simple concurrency semantics for Java that avoids basic problems at a cost of backward compatibility.
Citations: 2
Extended sequential reasoning for data-race-free programs
Workshop on Memory System Performance and Correctness Pub Date : 2011-06-05 DOI: 10.1145/1988915.1988922
Laura Effinger-Dean, H. Boehm, Dhruva R. Chakrabarti, P. Joisha
Abstract: Most multithreaded programming languages prohibit or discourage data races. By avoiding data races, we are guaranteed that variables accessed within a synchronization-free code region cannot be modified by other threads, allowing us to reason about such code regions as though they were single-threaded. However, such single-threaded reasoning is not limited to synchronization-free regions. We present a simple characterization of extended interference-free regions in which variables cannot be modified by other threads. This characterization shows that, in the absence of data races, important code analysis problems often have surprisingly easy answers. For instance, we can use local analysis to determine when lock and unlock calls refer to the same mutex. Our characterization can be derived from prior work on safe compiler transformations, but it can also be simply derived from first principles, and justified in a very broad context. In addition, systematic reasoning about overlapping interference-free regions yields insight about optimization opportunities that were not previously apparent. We give preliminary results for a prototype implementation of interference-free regions in the LLVM compiler infrastructure. We also discuss other potential applications for interference-free regions.
Citations: 26
Data-race exceptions have benefits beyond the memory model
Workshop on Memory System Performance and Correctness Pub Date : 2011-06-05 DOI: 10.1145/1988915.1988923
Benjamin P. Wood, L. Ceze, D. Grossman
Abstract: Proposals to treat data races as exceptions provide simplified semantics for shared-memory multithreaded programming languages and memory models by guaranteeing that execution remains data-race-free and sequentially consistent or an exception is raised. However, the high cost of precise race detection has kept the cost-to-benefit ratio of data-race exceptions too high for widespread adoption. Most research to improve this ratio focuses on lowering performance cost. In this position paper, we argue that with small changes in how we view data races, data-race exceptions enable a broad class of benefits beyond the memory model, including performance and simplicity in applications at the runtime system level. When attempted (but exception-raising) racy accesses are treated as legal (but exceptional) behavior, applications can exploit the guarantees of the data-race exception mechanism by performing potentially racy accesses and guiding execution based on whether these potential races manifest as exceptions. We apply these insights to concurrent garbage collection, optimistic synchronization elision, and best-effort automatic recovery from exceptions due to sequential-consistency-violating races.
Citations: 4
Performance implications of fence-based memory models
Workshop on Memory System Performance and Correctness Pub Date : 2011-06-05 DOI: 10.1145/1988915.1988919
H. Boehm
Abstract: Most mainstream shared-memory parallel programming languages are converging to a memory model, or shared variable semantics, centered on providing sequential consistency for most data-race-free programs. OpenMP, along with a small number of other languages, defines its memory model in terms of implicit fence (e.g. OpenMP flush) operations that force memory accesses to become visible to other threads in order. Synchronization operations provided by the language implicitly include such fences. In the simplest cases this is equivalent to a promise of sequential consistency for data-race-free programs. However, real languages typically also provide atomic operations with weak memory ordering constraints, such as the OpenMP atomic directives. These break the above equivalence, making the fence-based model stronger in ways that are observable, but not generally useful. As a result, conventional lock implementations are often accidentally prohibited, adding significant overhead for uncontended locks. We show that this problem affects both OpenMP and, in a more subtle way, UPC. We have been working with the OpenMP ARB to resolve these issues in future versions of OpenMP.
Citations: 2
Minor memory references matter in collaborative caching
Workshop on Memory System Performance and Correctness Pub Date : 2011-06-05 DOI: 10.1145/1988915.1988927
Xiaoming Gu
Abstract: Collaborative caching uses different caching methods, e.g., LRU and MRU, for data with good or poor locality. Poor-locality data are evicted by MRU quickly, leaving most cache space to hold good-locality data by LRU. In our previous study, we selected static memory references with poor locality to use MRU but neglected minor references, which are memory instructions that contribute no more than 0.1% total memory accesses. After removing this restriction, we found that three SPEC CPU benchmarks have on average 6.2 times fewer miss reduction or 9.8% reduction in absolute miss ratio.
Citations: 0
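The LRU/MRU mechanism described in the abstract above can be illustrated with a small simulation. This is a minimal sketch under assumptions, not the paper's implementation: it models a fully associative cache, and the function name count_misses, the (ref_id, block) trace format, and the toy trace are all hypothetical. An access issued by a reference hinted as poor-locality leaves its block at the eviction end of the recency order, so that block is the next one to be replaced.

```python
from collections import OrderedDict


def count_misses(trace, capacity, mru_refs=frozenset()):
    """Simulate a fully associative cache and return the number of misses.

    trace    -- iterable of (ref_id, block) pairs; ref_id names the static
                memory reference (instruction) that issues the access
    capacity -- number of blocks the cache can hold
    mru_refs -- ref_ids hinted as poor-locality (hypothetical hint set)
    """
    # First key = next eviction victim, last key = most recently used.
    cache = OrderedDict()
    misses = 0
    for ref_id, block in trace:
        if block in cache:
            del cache[block]               # hit: re-inserted below
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the current victim
        cache[block] = None                # place at the most-recently-used end
        if ref_id in mru_refs:
            # Hinted access: move the block to the eviction end instead.
            cache.move_to_end(block, last=False)
    return misses


# Toy trace: two hot blocks reused every iteration, plus a streaming scan
# that touches two fresh blocks per iteration and never reuses them.
trace = []
for i in range(100):
    trace += [("hot", "A"), ("hot", "B"), ("scan", 2 * i), ("scan", 2 * i + 1)]

lru_misses = count_misses(trace, capacity=3)
hinted_misses = count_misses(trace, capacity=3, mru_refs={"scan"})
print(lru_misses, hinted_misses)
```

On this toy trace pure LRU misses on every access (400 misses), because the scan keeps pushing the hot blocks out, while hinting the scan reference keeps the hot blocks resident (202 misses). The numbers describe only this artificial trace and say nothing about the benchmarks in the paper.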
Garbage collection for multicore NUMA machines
Workshop on Memory System Performance and Correctness Pub Date : 2011-05-12 DOI: 10.1145/1988915.1988929
Sven Auhagen, Lars Bergstrom, M. Fluet, John H. Reppy
Abstract: Modern high-end machines feature multiple processor packages, each of which contains multiple independent cores and integrated memory controllers connected directly to dedicated physical RAM. These packages are connected via a shared bus, creating a system with a heterogeneous memory hierarchy. Since this shared bus has less bandwidth than the sum of the links to memory, aggregate memory bandwidth is higher when parallel threads all access memory local to their processor package than when they access memory attached to a remote package. This bandwidth limitation has traditionally limited the scalability of modern functional language implementations, which seldom scale well past 8 cores, even on small benchmarks. This work presents a garbage collector integrated with our strict, parallel functional language implementation, Manticore, and shows that it scales effectively on both a 48-core AMD Opteron machine and a 32-core Intel Xeon machine.
Citations: 26
The case for simple, visible cache coherency
Workshop on Memory System Performance and Correctness Pub Date : 2008-03-02 DOI: 10.1145/1353522.1353532
R. Kunz, M. Horowitz
Abstract: The shared memory research community has proposed many complex communication protocols that aim to eliminate specific performance bottlenecks, while still providing an easy-to-use communication interface. Although tailored protocols can eliminate some bottlenecks that arise in real applications, removing the cause of the bottleneck through software optimizations and bug fixes is cheaper to implement, faster to fix (once found), and requires no additional support by the hardware beyond a simple shared memory interface. In fact, in our experience, the choice of coherence protocol is much less important than providing an efficient hardware feedback that identifies the source of the problem. Future cache-coherence research should focus efforts on illuminating memory system behavior, providing smarter tools to identify bottlenecks, and helping to eliminate them through software optimizations.
Citations: 2
GC assertions: using the garbage collector to check heap properties
Workshop on Memory System Performance and Correctness Pub Date : 2008-03-02 DOI: 10.1145/1353522.1353533
E. Aftandilian, Samuel Z. Guyer
Abstract: This paper introduces GC assertions, a system interface that programmers can use to check for errors, such as data structure invariant violations, and to diagnose performance problems, such as memory leaks. GC assertions are checked by the garbage collector, which is in a unique position to gather information and answer questions about the lifetime and connectivity of objects in the heap. We introduce several kinds of GC assertions, and we describe how they are implemented in the collector. We also describe our reporting mechanism, which provides a complete path through the heap to the offending objects. We show results for one type of assertion that allows the programmer to indicate that an object should be reclaimed at the next GC. We find that using this assertion we can quickly identify a memory leak and its cause with negligible overhead.
Citations: 26
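The "should be reclaimed at the next GC" assertion mentioned in the abstract above can be approximated at user level. The following is only a rough sketch under assumptions: the paper implements its assertions inside the collector, whereas this version uses Python weak references from outside it, and the names HeapChecker, assert_dead, and collect_and_check are hypothetical. It also reports only a label for each offending object, not a path through the heap.

```python
import gc
import weakref


class HeapChecker:
    """User-level approximation of an 'assert dead' GC assertion (hypothetical API)."""

    def __init__(self):
        self._expected_dead = []  # list of (label, weak reference) pairs

    def assert_dead(self, obj, label):
        # Record a weak reference so the checker itself does not keep obj alive.
        self._expected_dead.append((label, weakref.ref(obj)))

    def collect_and_check(self):
        """Run a collection and return labels of objects that are still reachable."""
        gc.collect()
        violations = [label for label, ref in self._expected_dead if ref() is not None]
        self._expected_dead.clear()
        return violations


class Node:
    pass


checker = HeapChecker()
registry = []          # stands in for a long-lived structure that leaks entries

leaked = Node()
registry.append(leaked)            # forgotten reference: the object stays reachable
freed = Node()

checker.assert_dead(leaked, "leaked")
checker.assert_dead(freed, "freed")
del leaked, freed                  # the program believes both objects are now garbage

violations = checker.collect_and_check()
print(violations)
```

Here only the object still held by registry is reported, which is the kind of signal the abstract describes for locating a memory leak. Finding which structure holds the offending object would need extra work (for example with gc.get_referrers), which this sketch leaves out.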
Reasoning about the ARM weakly consistent memory model
Workshop on Memory System Performance and Correctness Pub Date : 2008-03-02 DOI: 10.1145/1353522.1353528
Nathan Chong, Samin S. Ishtiaq
Abstract: This paper describes a formalization of the ARM weakly consistent memory model: the architectural contract between parallel programs and shared memory multiprocessor implementations. We claim that a clean, unambiguous, and mechanically verifiable specification is a valuable resource for architects, micro-architects and programmers; it allows implementors to forge aggressive static (compiler) and dynamic (JIT, micro-architecture) machines to run code. We discuss the key construct of the ARM memory model, observability (the order in which memory accesses become visible to processors in a shared memory multiprocessor system), and examine its use in litmus tests.
Citations: 38
What can performance counters do for memory subsystem analysis?
Workshop on Memory System Performance and Correctness Pub Date : 2008-03-02 DOI: 10.1145/1353522.1353531
S. Eranian
Abstract: Nowadays, all major processors provide a set of performance counters which capture micro-architectural level information, such as the number of elapsed cycles, cache misses, or instructions executed. Counters can be found in processor cores, processor die, chipsets, or in I/O cards. They can provide a wealth of information as to how the hardware is being used by software. Many processors now support events to measure, precisely and with very limited overhead, the traffic between a core and the memory subsystem. It is possible to compute average load latency and bus bandwidth utilization. This valuable information can be used to improve code quality and placement of threads to maximize hardware utilization. We postulate that performance counters are the key hardware resource to locate and understand issues related to the memory subsystem. In this paper we illustrate our position by showing how certain key memory performance metrics can be gathered easily on today's hardware.
Citations: 73