Workshop on Memory System Performance and Correctness: latest papers, page 5

Concurrency control with data coloring

Workshop on Memory System Performance and Correctness Pub Date : 2008-03-02 DOI: 10.1145/1353522.1353525

L. Ceze, C. V. Praun, Calin Cascaval, Pablo Montesinos, J. Torrellas

Citations: 21

The potential for variable-granularity access tracking for optimistic parallelism

Workshop on Memory System Performance and Correctness Pub Date : 2008-03-02 DOI: 10.1145/1353522.1353527

Mihai Burcea, J. Gregory Steffan, C. Amza

Abstract: Support for optimistic parallelism such as thread-level speculation (TLS) and transactional memory (TM) has been proposed to ease the task of parallelizing software to exploit the new abundance of multicores. A key requirement for such support is the mechanism for tracking memory accesses so that conflicts between speculative threads or transactions can be detected; existing schemes mainly track accesses at a single fixed granularity, i.e., at the word level, cache-line level, or page level. In this paper we demonstrate, for a hardware implementation of TLS and corresponding speculatively-parallelized SpecINT benchmarks, that the coarsest access tracking granularity that does not incur false violations varies significantly across applications, within applications, and across ranges of memory, from word size to page size. These results motivate a variable-granularity approach to access tracking, and we show that such an approach can reduce the number of memory ranges that must be tracked and compared to detect conflicts by an order of magnitude relative to word-level tracking, without increasing false violations. We are currently developing variable-granularity implementations of both a hardware-based TLS system and an STM system.

Citations: 6
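Note on the entry above: the idea of a "coarsest access tracking granularity that does not incur false violations" can be illustrated with a small sketch. This is not the authors' hardware mechanism; the granularities (4-byte word, 64-byte line, 4096-byte page), the function names, and the two-thread setup are all assumptions chosen for illustration.

```python
# Illustrative granularities in bytes (assumed values, not taken from the paper).
WORD, LINE, PAGE = 4, 64, 4096


def blocks(addresses, granularity):
    """Map byte addresses to the set of tracking blocks they fall into."""
    return {addr // granularity for addr in addresses}


def conflicts(reads_a, writes_a, reads_b, writes_b, granularity):
    """Report a conflict if one thread writes a block the other reads or writes."""
    ra, wa = blocks(reads_a, granularity), blocks(writes_a, granularity)
    rb, wb = blocks(reads_b, granularity), blocks(writes_b, granularity)
    return bool(wa & (rb | wb)) or bool(wb & ra)


def coarsest_safe_granularity(reads_a, writes_a, reads_b, writes_b):
    """Return the coarsest granularity whose verdict matches word-level tracking.

    A coarser granularity that reports a conflict where word-level tracking
    reports none would cause a false violation, so the search stops there.
    """
    truth = conflicts(reads_a, writes_a, reads_b, writes_b, WORD)
    best = WORD
    for g in (LINE, PAGE):
        if conflicts(reads_a, writes_a, reads_b, writes_b, g) != truth:
            break
        best = g
    return best


if __name__ == "__main__":
    # Thread A writes 0x1000; thread B reads a nearby address.
    for b_read in (0x1004, 0x1040, 0x3000):
        g = coarsest_safe_granularity(set(), {0x1000}, {b_read}, set())
        print(hex(b_read), "->", g, "bytes")
```

In this toy setup, a read at 0x1004 shares a cache line with the write and forces word-level tracking, a read at 0x1040 allows line-level tracking, and a read at 0x3000 allows page-level tracking, which mirrors the abstract's observation that the safe granularity varies across ranges of memory.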

General and efficient locking without blocking

Workshop on Memory System Performance and Correctness Pub Date : 2008-03-02 DOI: 10.1145/1353522.1353524

Y. Smaragdakis, Anthony Kay, R. Behrends, M. Young

Abstract: Standard concurrency control mechanisms offer a trade-off: transactional memory approaches maximize concurrency, but suffer high overheads and the cost of retrying in the case of actual contention. Locking offers lower overheads, but typically reduces concurrency due to the difficulty of associating locks with the exact data that need to be accessed. Moreover, locking allows irreversible operations, is ubiquitous in legacy software, and seems unlikely to ever be completely supplanted. We believe that the trade-off between transactions and (blocking) locks has not been sufficiently exploited to obtain a "best of both worlds" mechanism, although the main components have been identified. Mechanisms for converting locks to atomic sections (which can abort and retry) have already been proposed in the literature: Rajwar and Goodman's "lock elision" (at the hardware level) and Welc et al.'s hybrid monitors (at the software level) are the best-known representatives. Nevertheless, these approaches admit improvements on both the generality and the performance fronts. In this position paper we present two ideas. First, we discuss an adaptive criterion for switching from a locking to a transactional implementation, and back to a locking implementation if the transactional one appears to be introducing overhead for no gain in concurrency. Second, we discuss the issues arising when locks are nested. Contrary to assertions in past work, transforming locks into transactions can be incorrect in the presence of nesting. We explain the problem and provide a precise condition for safety.

Citations: 6
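Note on the entry above: the abstract does not state the adaptive criterion itself, so the following is only a hypothetical sketch of what switching between a locking and a transactional implementation might look like. The class name, the window size, and both thresholds are invented for illustration and are not taken from the paper.

```python
class AdaptiveMode:
    """Hypothetical per-critical-section policy (not the paper's criterion).

    In "lock" mode a bad event is a contended acquire; in "txn" mode a bad
    event is an aborted transaction. The mode is re-evaluated once per window.
    """

    def __init__(self, window=100, contention_threshold=0.2, abort_threshold=0.5):
        self.mode = "lock"
        self.window = window
        self.contention_threshold = contention_threshold
        self.abort_threshold = abort_threshold
        self.events = 0
        self.bad = 0

    def record(self, bad_event):
        """Record one execution of the critical section and return the mode to use next."""
        self.events += 1
        self.bad += 1 if bad_event else 0
        if self.events < self.window:
            return self.mode
        rate = self.bad / self.events
        if self.mode == "lock" and rate > self.contention_threshold:
            # Threads often wait on the lock: optimistic execution may add concurrency.
            self.mode = "txn"
        elif self.mode == "txn" and rate > self.abort_threshold:
            # Frequent aborts: transactional overhead is not paying off.
            self.mode = "lock"
        self.events = 0
        self.bad = 0
        return self.mode


if __name__ == "__main__":
    policy = AdaptiveMode(window=4)
    for outcome in [True] * 4 + [True] * 4 + [False] * 4:
        print(policy.record(outcome))
```

The design choice sketched here is to count a single kind of "bad event" per mode over a fixed window, which keeps the bookkeeping cheap; a real system would need its own measurements to decide what counts as overhead without a gain in concurrency.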

Reliability-aware data placement for partial memory protection in embedded processors

Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178600

M. Mehrara, T. Austin

Citations: 3

What do high-level memory models mean for transactions?

Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178609

D. Grossman, Jeremy Manson, W. Pugh

Abstract: Many people have proposed adding transactions, or atomic blocks, to type-safe high-level programming languages. However, researchers have not considered the semantics of transactions with respect to a memory model weaker than sequential consistency. The details of such semantics are more subtle than many people realize, and the interaction between compiler transformations and transactions could produce behaviors that many people find surprising. A language's memory model, which determines these interactions, must clearly indicate which behaviors are legal and which are not. These design decisions affect both the idioms that are useful for designing concurrent software and the compiler transformations that are legal within the language. Cases where semantics are more subtle than people expect include the actual meaning of both strong and weak atomicity; correct idioms for thread-safe lazy initialization; compiler transformations of transactions that touch only thread-local memory; and whether there is a well-defined notion for transactions that corresponds to the notion of correct and incorrect use of synchronization in Java. Open questions for a high-level memory model that includes transactions involve issues of both isolation and ordering.

Citations: 63

Keynote talk: Challenges in chip multiprocessor memory systems

Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178607

D. Wood

Citations: 0

Implicit and explicit optimizations for stencil computations

Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178605

S. Kamil, K. Datta, Samuel Williams, L. Oliker, J. Shalf, K. Yelick

Citations: 153

A comprehensive study of hardware/software approaches to improve TLB performance for Java applications on embedded systems

Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178614

Jinzhan Peng, Guei-Yuan Lueh, Gansha Wu, Xiaogang Gou, R. Rakvic

Abstract: The working set size of Java applications on embedded systems has recently been increasing, causing the Translation Lookaside Buffer (TLB) to become a serious performance bottleneck. From a thorough analysis of the SPECjvm98 benchmark suite executing on a commodity embedded system, we find that TLB misses account for 24% to 50% of the total execution time. We explore and evaluate a wide spectrum of TLB-enhancing techniques with different combinations of software/hardware approaches, namely superpages for reducing TLB miss rates, a two-level TLB and TLB prefetching for reducing both TLB miss rates and TLB miss latency, and even a no-TLB design for removing TLB overhead completely. We adapt and then, in a novel way, extend these approaches to fit the design space of embedded systems executing Java code. We compare these approaches, discussing their performance behavior, software/hardware complexity and constraints, especially the design implications for the application, runtime and OS. We first conclude that even with the aggressive approaches presented, there remains a performance bottleneck with the TLB. Second, in addition to facing very different design considerations and constraints for embedded systems, proven hardware techniques, such as TLB prefetching, have different performance implications. Third, software-based solutions, the no-TLB design and superpaging, appear to be more effective in improving Java application performance on embedded systems. Finally, beyond performance, these approaches have their respective pros and cons; it is left to the system designer to make the appropriate engineering tradeoff.

Citations: 8
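Note on the entry above: the effect of superpages on TLB miss rates can be seen in a toy model. The sketch below simulates a fully associative TLB with LRU replacement; the TLB size (32 entries), the page sizes (4 KiB and 1 MiB), and the access pattern are assumptions for illustration, not the configuration evaluated in the paper.

```python
from collections import OrderedDict


def tlb_misses(addresses, page_size, entries):
    """Count misses in a fully associative LRU TLB with the given number of entries."""
    tlb = OrderedDict()  # keys are virtual page numbers, ordered oldest to newest
    misses = 0
    for addr in addresses:
        page = addr // page_size
        if page in tlb:
            tlb.move_to_end(page)  # mark as most recently used
        else:
            misses += 1
            tlb[page] = True
            if len(tlb) > entries:
                tlb.popitem(last=False)  # evict the least recently used entry
    return misses


if __name__ == "__main__":
    KIB, MIB = 1024, 1024 * 1024
    # Two sequential sweeps over a 1 MiB working set, touching one address per 4 KiB.
    sweep = list(range(0, MIB, 4 * KIB)) * 2
    print("4 KiB pages:", tlb_misses(sweep, 4 * KIB, entries=32))
    print("1 MiB superpage:", tlb_misses(sweep, MIB, entries=32))
```

With 4 KiB pages the 256-page working set exceeds the 32-entry TLB, so LRU evicts every translation before it is reused and all 512 accesses miss; with a single 1 MiB superpage the whole working set is covered by one entry and only the first access misses.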

Atomicity via source-to-source translation

Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178611

Benjamin Hindman, D. Grossman

Citations: 87

Memory models for open-nested transactions

Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178610

Kunal Agrawal, C. Leiserson, Jim Sukha

Abstract: Open nesting provides a loophole in the strict model of atomic transactions. Moss and Hosking suggested adapting open nesting for transactional memory, and Moss and a group at Stanford have proposed hardware schemes to support open nesting. Since these researchers have described their schemes using only operational definitions, however, the semantics of these systems have not been specified in an implementation-independent way. This paper offers a framework for defining and exploring the memory semantics of open nesting in a transactional-memory setting. Our framework allows us to define the traditional model of serializability and two new transactional-memory models, race freedom and prefix race freedom. The weakest of these memory models, prefix race freedom, closely resembles the Stanford open-nesting model. We prove that these three memory models are equivalent for transactional-memory systems that support only closed nesting, as long as aborted transactions are "ignored." We prove that for systems that support open nesting, however, the models of serializability, race freedom, and prefix race freedom are distinct. We show that the Stanford TM system implements a model at least as strong as prefix race freedom and strictly weaker than race freedom. Thus, their model compromises serializability, the property traditionally used to reason about the correctness of transactions.

Citations: 30