Workshop on Memory System Performance and Correctness最新文献

筛选
英文 中文
Concurrency control with data coloring 具有数据着色的并发控制
Workshop on Memory System Performance and Correctness Pub Date : 2008-03-02 DOI: 10.1145/1353522.1353525
L. Ceze, C. V. Praun, Calin Cascaval, Pablo Montesinos, J. Torrellas
{"title":"Concurrency control with data coloring","authors":"L. Ceze, C. V. Praun, Calin Cascaval, Pablo Montesinos, J. Torrellas","doi":"10.1145/1353522.1353525","DOIUrl":"https://doi.org/10.1145/1353522.1353525","url":null,"abstract":"Concurrency control is one of the main sources of error and complexity in shared memory parallel programming. While there are several techniques to handle concurrency control such as locks and transactional memory, simplifying concurrency control has proved elusive.\u0000 In this paper we introduce the Data Coloring programming model, based on the principles of our previous work on architecture support for data-centric synchronization. The main idea is to group data structures into consistency domains and mark places in the control flow where data should be consistent. Based on these annotations, the system dynamically infers transaction boundaries. An important aspect of data coloring is that the occurrence of a synchronization defect is typically determinate and leads to a violation of liveness rather than to a safety violation. Finally, this paper includes empirical data that shows that most of the critical sections in large applications are used in a data-centric manner.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126282401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 21
The potential for variable-granularity access tracking for optimistic parallelism 为乐观并行性提供可变粒度访问跟踪的可能性
Workshop on Memory System Performance and Correctness Pub Date : 2008-03-02 DOI: 10.1145/1353522.1353527
Mihai Burcea, J. Gregory Steffan, C. Amza
{"title":"The potential for variable-granularity access tracking for optimistic parallelism","authors":"Mihai Burcea, J. Gregory Steffan, C. Amza","doi":"10.1145/1353522.1353527","DOIUrl":"https://doi.org/10.1145/1353522.1353527","url":null,"abstract":"Support for optimistic parallelism such as thread-level speculation (TLS) and transactional memory (TM) has been proposed to ease the task of parallelizing software to exploit the new abundance of multicores. A key requirement for such support is the mechanism for tracking memory accesses so that conflicts between speculative threads or transactions can be detected; existing schemes mainly track accesses at a single fixed granularity---i.e., at the word level, cache-line level, or page level. In this paper we demonstrate, for a hardware implementation of TLS and corresponding speculatively-parallelized SpecINT benchmarks, that the coarsest access tracking granularity that does not incur false violations varies significantly across applications, within applications, and across ranges of memory---from word-size to page size. These results motivate a variable-granularity approach to access tracking, and we show that such an approach can reduce the number of memory ranges that must be tracked and compared to detect conflicts can be reduced by an order of magnitude compared to word-level tracking, without increasing false violations. We are currently developing variable-granularity implementations of both a hardware-based TLS system and an STM system.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"1429 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132670514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
General and efficient locking without blocking 无阻塞的通用高效锁定
Workshop on Memory System Performance and Correctness Pub Date : 2008-03-02 DOI: 10.1145/1353522.1353524
Y. Smaragdakis, Anthony Kay, R. Behrends, M. Young
{"title":"General and efficient locking without blocking","authors":"Y. Smaragdakis, Anthony Kay, R. Behrends, M. Young","doi":"10.1145/1353522.1353524","DOIUrl":"https://doi.org/10.1145/1353522.1353524","url":null,"abstract":"Standard concurrency control mechanisms offer a trade-off: Transactional memory approaches maximize concurrency, but suffer high overheads and cost for retrying in the case of actual contention. Locking offers lower overheads, but typically reduces concurrency due to the difficulty of associating locks with the exact data that need to be accessed. Moreover, locking allows irreversible operations, is ubiquitous in legacy software, and seems unlikely to ever be completely supplanted.\u0000 We believe that the trade-off between transactions and (blocking) locks has not been sufficiently exploited to obtain a \"best of both worlds\" mechanism, although the main components have been identified. Mechanisms for converting locks to atomic sections (which can abort and retry) have already been proposed in the literature: Rajwar and Goodman's \"lock elision\" (at the hardware level) and Welc et al.'s hybrid monitors (at the software level) are the best known representatives. Nevertheless, these approaches admit improvements on both the generality and the performance front. In this position paper we present two ideas. First, we discuss an adaptive criterion for switching from a locking to a transactional implementation, and back to a locking implementation if the transactional one appears to be introducing overhead for no gain in concurrency. Second, we discuss the issues arising when locks are nested. Contrary to assertions in past work, transforming locks into transactions can be incorrect in the presence of nesting. We explain the problem and provide a precise condition for safety.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127420130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Reliability-aware data placement for partial memory protection in embedded processors 嵌入式处理器中部分内存保护的可靠性感知数据放置
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178600
M. Mehrara, T. Austin
{"title":"Reliability-aware data placement for partial memory protection in embedded processors","authors":"M. Mehrara, T. Austin","doi":"10.1145/1178597.1178600","DOIUrl":"https://doi.org/10.1145/1178597.1178600","url":null,"abstract":"Low cost protection of embedded systems against soft errors has recently become a major concern. This issue is even more critical in memory elements that are inherently more prone to transient faults. In this paper, we propose a reliability aware data placement technique in order to partially protect embedded memory systems. We show that by adopting this method instead of traditional placement schemes with complete memory protection, an acceptable level of fault tolerance can be achieved while incurring less area and power overhead. In this approach, each variable in the program is placed in either protected or non-protected memory area according to the profile-driven liveness analysis of all memory variables. In order to measure the level of fault coverage, we inject faults into the memory during the course of program execution in a Monte Carlo simulation framework. Subsequently, we calculate the coverage of partial protection scheme based on the number of protected, failed and crashed runs during the fault injection experiment.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123413034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
What do high-level memory models mean for transactions? 高级内存模型对事务意味着什么?
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178609
D. Grossman, Jeremy Manson, W. Pugh
{"title":"What do high-level memory models mean for transactions?","authors":"D. Grossman, Jeremy Manson, W. Pugh","doi":"10.1145/1178597.1178609","DOIUrl":"https://doi.org/10.1145/1178597.1178609","url":null,"abstract":"Many people have proposed adding transactions, or atomic blocks, to type-safe high-level programming languages. However, researchers have not considered the semantics of transactions with respect to a memory model weaker than sequential consistency. The details of such semantics are more subtle than many people realize, and the interaction between compiler transformations and transactions could produce behaviors that many people find surprising. A language's memory model, which determines these interactions, must clearly indicate which behaviors are legal, and which are not. These design decisions affect both the idioms that are useful for designing concurrent software and the compiler transformations that are legal within the language.Cases where semantics are more subtle than people expect include the actual meaning of both strong and weak atomicity; correct idioms for thread safe lazy initialization; compiler transformations of transactions that touch only thread local memory; and whether there is a well-defined notion for transactions that corresponds to the notion of correct and incorrect use of synchronization in Java. Open questions for a high-level memory-model that includes transactions involve both issues of isolation and ordering.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126873471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 63
Keynote talk challenges in chip multiprocessor memory systems 主题演讲:芯片多处理器存储系统的挑战
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178607
D. Wood
{"title":"Keynote talk challenges in chip multiprocessor memory systems","authors":"D. Wood","doi":"10.1145/1178597.1178607","DOIUrl":"https://doi.org/10.1145/1178597.1178607","url":null,"abstract":"The semiconductor industry appears on the brink of an arms race, competing to see which company can cram the most cores on a single die. Yet early entries are hardly well-balanced, general-purpose computers: Sun's Niagara has feeble floating-point performance and IBM's Cell processor is a thinly disguised GPU. Worse, no company has announced anything that will help address the real problem: programming the multithreaded applications needed to exploit the ample computational resources. This talk will discuss several challenges facing the memory system designers of emerging computation-rich chip multiprocessors.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125302378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A comprehensive study of hardware/software approaches to improve TLB performance for java applications on embedded systems 对提高嵌入式系统上java应用程序的TLB性能的硬件/软件方法的综合研究
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178614
Jinzhan Peng, Guei-Yuan Lueh, Gansha Wu, Xiaogang Gou, R. Rakvic
{"title":"A comprehensive study of hardware/software approaches to improve TLB performance for java applications on embedded systems","authors":"Jinzhan Peng, Guei-Yuan Lueh, Gansha Wu, Xiaogang Gou, R. Rakvic","doi":"10.1145/1178597.1178614","DOIUrl":"https://doi.org/10.1145/1178597.1178614","url":null,"abstract":"The working set size of Java applications on embedded systems has recently been increasing, causing the Translation Lookaside Buffer (TLB) to become a serious performance bottleneck. From a thorough analysis of the SPECjvm98 benchmark suite executing on a commodity embedded system, we find TLB misses attribute from 24% to 50% of the total execution time. We explore and evaluate a wide spectrum of TLB-enhancing techniques with different combinations of software/hardware approaches, namely superpage for reducing TLB miss rates, two-level TLB and TLB prefetching for reducing both TLB miss rates and TLB miss latency, and even a no-TLB design for removing TLB overhead completely. We adapt and then in a novel way extend these approaches to fit the design space of embedded systems executing Java code. We compare these approaches, discussing their performance behavior, software/hardware complexity and constraints, especially the design implications for the application, runtime and OS.We first conclude that even with the aggressive approaches presented, there remains a performance bottleneck with the TLB. Second, in addition to facing very different design considerations and constraints for embedded systems, proven hardware techniques, such as TLB prefetching have different performance implications. Third, software based solutions, no-TLB design and superpaging, appear to be more effective in improving Java application performance on embedded systems. Finally, beyond performance, these approaches have their respective pros and cons; it is left to the system designer to make the appropriate engineering tradeoff.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129626911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
Implicit and explicit optimizations for stencil computations 模板计算的隐式和显式优化
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178605
S. Kamil, K. Datta, Samuel Williams, L. Oliker, J. Shalf, K. Yelick
{"title":"Implicit and explicit optimizations for stencil computations","authors":"S. Kamil, K. Datta, Samuel Williams, L. Oliker, J. Shalf, K. Yelick","doi":"10.1145/1178597.1178605","DOIUrl":"https://doi.org/10.1145/1178597.1178605","url":null,"abstract":"Stencil-based kernels constitute the core of many scientific applications on block-structured grids. Unfortunately, these codes achieve a low fraction of peak performance, due primarily to the disparity between processor and main memory speeds. We examine several optimizations on both the conventional cache-based memory systems of the Itanium 2, Opteron, and Power5, as well as the heterogeneous multicore design of the Cell processor. The optimizations target cache reuse across stencil sweeps, including both an implicit cache oblivious approach and a cache-aware algorithm blocked to match the cache structure. Finally, we consider stencil computations on a machine with an explicitly-managed memory hierarchy, the Cell processor. Overall, results show that a cache-aware approach is significantly faster than a cache oblivious approach and that the explicitly managed memory on Cell is more efficient: Relative to the Power5, it has almost 2x more memory bandwidth and is 3.7x faster.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121990913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 153
Atomicity via source-to-source translation 通过源到源转换实现原子性
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178611
Benjamin Hindman, D. Grossman
{"title":"Atomicity via source-to-source translation","authors":"Benjamin Hindman, D. Grossman","doi":"10.1145/1178597.1178611","DOIUrl":"https://doi.org/10.1145/1178597.1178611","url":null,"abstract":"We present an implementation and evaluation of atomicity (also known as software transactions) for a dialect of Java. Our implementation is fundamentally different from prior work in three respects: (1) It is entirely a source-to-source translation, producing Java source code that can be compiled by any Java compiler and run on any Java Virtual Machine. (2) It can enforce \"strong\" atomicity without assuming special hardware or a uniprocessor. (3) The implementation uses locks rather than optimistic concurrency, but it cannot deadlock and requires inter-thread communication only when there is data contention.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123871248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 87
Memory models for open-nested transactions 开放嵌套事务的内存模型
Workshop on Memory System Performance and Correctness Pub Date : 2006-10-22 DOI: 10.1145/1178597.1178610
Kunal Agrawal, C. Leiserson, Jim Sukha
{"title":"Memory models for open-nested transactions","authors":"Kunal Agrawal, C. Leiserson, Jim Sukha","doi":"10.1145/1178597.1178610","DOIUrl":"https://doi.org/10.1145/1178597.1178610","url":null,"abstract":"Open nesting provides a loophole in the strict model of atomic transactions. Moss and Hosking suggested adapting open nesting for transactional memory, and Moss and a group at Stanford have proposed hardware schemes to support open nesting. Since these researchers have described their schemes using only operational definitions, however, the semantics of these systems have not been specified in an implementation-independent way. This paper offers a framework for defining and exploring the memory semantics of open nesting in a transactionl-memory setting.Our framework allows us to define the traditional model of serializability and two new transactional-memory models, race freedom and prefix race freedom. The weakest of these memory models, prefix race freedom, closely resembles the Stanford openesting model. We prove that these three memory models are equivalent for transactional-memory systems that support only closed nesting, as long as aborted transactions are \"ignored.\" We prove that for systems that support open nesting, however, the models of serializability, race freedom, and prefix race freedom are distinct. We show that the Stanford TM system implements a model at least as strong as prefix race freedom and strictly weaker than race freedom. Thus, their model compromises serializability, the property traditionally used to reason about the correctness of transactions.","PeriodicalId":130040,"journal":{"name":"Workshop on Memory System Performance and Correctness","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129537382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 30
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信