Performance implications of fence-based memory models

Workshop on Memory System Performance and Correctness. Pub Date: 2011-06-05. DOI: 10.1145/1988915.1988919

H. Boehm


Citations: 2

Abstract

Most mainstream shared-memory parallel programming languages are converging to a memory model, or shared variable semantics, centered on providing sequential consistency for most data-race-free programs. OpenMP, along with a small number of other languages, defines its memory model in terms of implicit fence (e.g. OpenMP flush) operations that force memory accesses to become visible to other threads in order. Synchronization operations provided by the language implicitly include such fences. In the simplest cases this is equivalent to a promise of sequential consistency for data-race-free programs. However, real languages typically also provide atomic operations with weak memory ordering constraints, such as the OpenMP atomic directives. These break the above equivalence, making the fence-based model stronger in ways that are observable, but not generally useful. As a result, conventional lock implementations are often accidentally prohibited, adding significant overhead for uncontended locks. We show that this problem affects both OpenMP and, in a more subtle way, UPC. We have been working with the OpenMP ARB to resolve these issues in future versions of OpenMP.
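The contrast drawn in the abstract can be pictured with a small C++ sketch. This is an illustration under stated assumptions, not code from the paper: the class names `AcquireReleaseLock` and `FullFenceLock` and the helper `uncontended_increments` are hypothetical, and C++ atomics stand in for the OpenMP and UPC constructs the abstract discusses. The first class is a conventional spinlock that orders memory only as far as a lock needs to; the second surrounds each lock operation with full fences, roughly the shape a fence-based reading of the specification might force on an implementation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical illustration only; neither class is taken from the paper
// or from any OpenMP or UPC runtime.

// Conventional spinlock: acquire ordering on lock, release ordering on unlock.
class AcquireReleaseLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        // Spin until the flag was previously false, i.e. we took the lock.
        while (locked_.exchange(true, std::memory_order_acquire)) {
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
};

// Fence-style lock: every operation is bracketed by full (seq_cst) fences,
// an assumed stand-in for a lock whose entry and exit each imply a flush.
class FullFenceLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        std::atomic_thread_fence(std::memory_order_seq_cst);
        while (locked_.exchange(true, std::memory_order_relaxed)) {
        }
        std::atomic_thread_fence(std::memory_order_seq_cst);
    }
    void unlock() {
        std::atomic_thread_fence(std::memory_order_seq_cst);
        locked_.store(false, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
    }
};

// Takes and releases the lock `iterations` times from a single thread,
// so the lock is never contended; returns the final counter value.
template <class Lock>
long uncontended_increments(long iterations) {
    assert(iterations >= 0);
    Lock lock;
    long counter = 0;
    for (long i = 0; i < iterations; ++i) {
        lock.lock();
        ++counter;
        lock.unlock();
    }
    return counter;
}
```

Both locks protect the counter correctly. The difference is only in how much ordering each one imposes: the fence-style version pays for its extra fences on every acquisition and release, even when no other thread is competing for the lock. How large that cost is depends on the hardware and compiler, and this sketch does not measure it.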

